Applications of AI in policing

Artificial Intelligence (AI) is used so frequently in technology discussions now that it has come to mean anything and everything. Many of those discussions are oriented around ChatGPT and similar generative AI tools, which are vastly different from many other AI applications in law enforcement. I wrote this blog post for those who are a bit bewildered by the current buzz, as a plain overview of current technology, its limitations, and potential future directions for the tech in policing.

I provide consulting services (statistical analysis and coding/tech help), but I am not selling any specific product here. AI can (and should) be used for good in policing, but agencies need to be aware of the snake oil currently on the market, as well as have reasonable expectations about what is (and is not) possible.

Generative AI ChatBots

Generative AI tools, like ChatGPT, take prompts as input and generate responses. Input prompts are mostly text, but can also be other modalities (like images, video, or audio). It is a simplification, but the text based large language models (LLMs) use historical text to build a model that predicts future text. So if you put in the prompt “Jack and Jill went up a hill, what were they doing?”, it will return the next part of the rhyme, “to fetch a pail of water”.
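As a concrete illustration, here is a minimal sketch of sending that prompt to a model through the openai Python package. This is just one plausible way to do it, and the model name is only an example of what you might have access to.

```python
# Minimal sketch of prompting an LLM via the openai package.
# Assumes the OPENAI_API_KEY environment variable is set; the model name
# here is just one plausible choice.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user",
         "content": "Jack and Jill went up a hill, what were they doing?"}
    ],
)

# The model returns a likely continuation based on the text it was trained on,
# so it will typically answer with the next line of the rhyme.
print(response.choices[0].message.content)
```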

For a slight detour, I do not personally consider the current chatbot tools intelligent. It is a common belief among computer scientists that if you build a tool that generates text (or answers questions) well enough, it can be considered artificial intelligence. The computer science definition of artificial intelligence is very unlikely to match the general lay interpretation though, and this causes people to mistake what the current tools are really capable of. ChatBots can seem lifelike, but they are not. They are simply calculators, but instead of putting in numbers you input text. The ChatBot will return “to fetch a pail of water” not because it understands the question in the way we humans understand things, but because it has seen text similar to that input in the data it was trained on.

Even if not intelligent, they can still be useful though. One example application is as a coding assistant. For example, I can ask ChatGPT “I want to write a function in python to query a table”, and it will reply with an answer containing python code examples. In practice this is very similar to auto-complete; you can have the application generate short code examples. I personally use it in much the same way I use Google, to help search for code examples or documentation. This can be extended into a smart search tool over your local data, using strategies like semantic search and retrieval augmented generation in combination with the language models.
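To give a flavor of the semantic search piece, here is a minimal sketch using the sentence-transformers library. The model name and the example narratives are just placeholders; a retrieval augmented generation setup would then paste the top matching documents into the LLM prompt so it can answer questions about them.

```python
# Minimal sketch of semantic search over local documents, using the
# sentence-transformers library (the model name is just one common choice).
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "Burglary of a residence on Main St, rear window forced open.",
    "Traffic stop on Route 9, driver cited for expired registration.",
    "Theft of a bicycle from the rack outside the public library.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_emb = model.encode(docs, normalize_embeddings=True)

query = "break in through a window"
query_emb = model.encode([query], normalize_embeddings=True)

# With normalized embeddings, the dot product is the cosine similarity.
scores = doc_emb @ query_emb[0]
best = int(np.argmax(scores))
print(docs[best])  # returns the burglary narrative, despite no shared keywords
```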

For image or video input, you can also ask it to describe charts (say, for making automated reports with natural language), or ask it to identify text in a video. Note, however, that these bespoke applications will not have any guarantees as to their accuracy.

This functionality will not replace computer programmers or analysts, however. It can do small, well-defined tasks; it cannot do larger, ambiguous projects that require much discretion. It will likely help beginner coders write faster, the same as auto-complete and spell check help people write faster. That is a very good thing, but not fundamentally groundbreaking. Companies selling “AI agents” to replace analysts are almost all peddling bullshit on the current market. I have not seen any that are better than paying for a ChatGPT, Sonnet, or Gemini license. And law enforcement agencies need to be extra careful that these tools are deployed in a way that does not leak sensitive information.

One last example that I think is very interesting is Axon’s use case of taking audio from body worn cameras to fill in police reports. I do not have any special insider knowledge, but what I believe Axon is doing is transcribing the BWC footage, and then feeding that text through a system to generate a summary of the incident. This I think has great promise, but any police department using this tech needs to do its own due diligence to make sure it is not causing errors. It is inevitable that this application will make mistakes; the questions are how often the mistakes occur, and whether a reasonable review system is in place to identify those mistakes when they happen.

One tech secret: sample demos can look very cool, but they are easy to fake (remember Theranos). For most new technology, I think PDs should do some local pilot testing before rolling it out more broadly. This is true for any application, but the LLM based apps that promise to do everything are, to me, at a point in the hype cycle where they should garner a higher level of scrutiny from law enforcement agencies.

Machine Learning for visual and audio input

Although most of the recent buzz around AI is about the LLM applications, the term has been used indiscriminately for applications that have a much longer history. These include supervised learning models, such as spatial or person based predictive policing, image applications (such as license plate readers or facial recognition), and other natural language processing tools, such as named entity recognition.

So although the prior models I discussed do have image input applications, these are different from a model trained specifically to read license plates. For specialized applications, a specially tailored model will almost always be more accurate than a GenAI model trained to do many tasks. So although you could use Google’s Gemini to identify license plates, in practice this does not make sense. The specially trained models will both be more accurate and come with estimates of their accuracy.

These applications need to be evaluated individually. For example, the biggest issue with facial recognition is the false positive rate. If you take an image and search through a database of 10 million images, and the false positive rate is 1 in a million (a very accurate model), then even if there should be no matches the machine will on average return 10 false matches. (Because of this, I do not think facial recognition will ever rise to the same evidentiary standard as, say, DNA in a rape case.) I think it is possible to use these systems to help investigations though, and likely in the near future they will be built into CCTV set ups. So you could search for a suspect in a black hat and identify recent footage that matches that description.
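To make that arithmetic explicit, the expected number of false matches is simply the size of the database being searched times the per-comparison false positive rate:

```python
# Back-of-the-envelope expected false positives for a one-to-many search.
database_size = 10_000_000    # images in the gallery being searched
false_positive_rate = 1e-6    # 1 in a million per comparison, a very accurate model

expected_false_matches = database_size * false_positive_rate
print(expected_false_matches)  # 10.0 false matches on average, even if the person is not in the database
```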

Although less in the news, I think there is strong potential for training audio/visual models to evaluate body worn camera footage for police officers as well. Transcribing audio is mostly a solved problem (although diarization, separating out the unique voices, is still a very hard problem). From the samples I have seen, current startups Polis and Truleo can realistically be used to help police departments flag problematic interactions (so they do not need to review every one), and perhaps even identify good behavior more generally.
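As an example of how low the barrier to transcription is now, here is a minimal sketch using the open source whisper package. The file name is hypothetical, and note this only produces a transcript, not the harder speaker diarization piece.

```python
# Minimal sketch of transcribing BWC audio with the open source whisper
# package (pip install openai-whisper). The file name is hypothetical.
import whisper

model = whisper.load_model("base")         # smaller model; larger ones are more accurate
result = model.transcribe("bwc_clip.wav")  # hypothetical audio clip
print(result["text"])

# This only produces the transcript; attributing lines to specific speakers
# (diarization) requires a separate tool and is still a harder problem.
```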

The last example of audio input that is common in policing now is acoustic gunshot detection systems. Again, these systems need to be evaluated on their own (if you work with Polis and think their system does well, that has no bearing on whether ShotSpotter works well). While I cannot speak to the accuracy of different acoustic gunshot detection technologies, they have not resulted in clear public safety benefits in various independent evaluations.

Machine Learning on Administrative Data

Machine learning applications on administrative data I have as a final category. The time and effort necessary to replicate something like the audio or visual input models are too great for a local police department to seriously consider building their own. Machine learning on administrative data, though, is within the ken of most advanced crime analysts, and easily doable with whatever compute you are using to read this blog. That is, if your analyst can make a hotspot map, they can probably learn to build a predictive policing application not all that different from PredPol.
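As a sketch of what that looks like in practice, here is a toy place based model: aggregate incidents to grid cells and time periods, use prior counts as features, and fit an off-the-shelf classifier to rank cells by risk. The data below are simulated just to show the workflow; a real analysis would build the cell-by-period table from local incident data.

```python
# Toy sketch of a place based prediction: prior counts per grid cell are used
# to predict whether the cell has any crime in the next period. The data here
# are simulated for illustration only.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_cells = 2000

cells = pd.DataFrame({
    "crimes_last_month": rng.poisson(0.5, n_cells),
    "crimes_last_year": rng.poisson(6.0, n_cells),
    "calls_for_service": rng.poisson(20.0, n_cells),
})
# Simulated outcome: did the cell have any crime in the following month.
cells["crime_next_month"] = (rng.poisson(0.2 + 0.05 * cells["crimes_last_year"]) > 0).astype(int)

X = cells[["crimes_last_month", "crimes_last_year", "calls_for_service"]]
y = cells["crime_next_month"]

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Rank cells by predicted risk; the top slice is your hotspot list.
cells["risk"] = model.predict_proba(X)[:, 1]
print(cells.sort_values("risk", ascending=False).head(10))
```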

I’ve already mentioned two of these applications: person based chronic offender predictions, as well as place based spatial predictions. Another common application is Early Intervention Systems. These systems are very similar to chronic offender predictions; they are just predicting whether police officers will engage in problematic behavior.

These applications take administrative data police departments already have access to, and build predictive models to help with operations.

One thing to be clear about with these models, however, is what exactly they are predicting. For example, I have seen some vendors of early intervention systems claim they can identify officers at high risk of having mental health problems. It is a noble goal, but not true. The systems are merely models of measured behavior. You can identify if an officer is using force more often than they should, but you cannot peek into their mind and know if they are depressed.

How to handle public criticism

All of these different applications come with the risk of public scrutiny. I am of the opinion that much of the public discourse around AI in policing is pretty silly. This includes both pro arguments that do not critically evaluate whether the tech is effective given its fiscal costs, and blanket opposition to the tech over civil rights concerns, even when those concerns are unfounded.

Good use of AI in policing will result in a more efficient and fair application of justice. It is in the best interest of police departments both to be more proactive in describing to the public how these tools are really used in practice (knowing how license plate readers work does not diminish their effectiveness), and to take the time to evaluate whether they are actually effective.

If you are considering a technology purchase, or wish to have an outside evaluation, feel free to get in touch. AI is not magic, and police departments should be well informed before making large investments in tech.