How AI Can Lead To False Arrests And Wrongful Convictions

2026-05-11 09:34:58

(MENAFN- The Conversation) In Baltimore on Oct. 20, 2025, a 17-year-old student named Taki Allen was sitting outside his high school after football practice when an artificial intelligence-enhanced surveillance camera falsely identified the Doritos bag in his pocket as a gun. Within moments police cars arrived, officers drew their weapons and Allen was forced to his knees and handcuffed while they searched him. All they found was a crumpled bag of chips. The AI's misidentification and the human decisions that followed turned a normal evening into a traumatic confrontation.

On Dec. 24, 2025, Angela Lipps, a Tennessee grandmother, was released after spending five months in jail because facial recognition software had incorrectly connected her to fraud crimes in North Dakota, a state she had never visited. Police had arrested her at gunpoint while she was babysitting her four grandchildren.

These are unfortunate examples of how AI can lead to mistreatment of people because of technical flaws as well as misplaced human faith in the technology's supposed objectivity. These cases involve different tools, but the underlying issue is the same. AI systems produce probabilities, and people treat them as certainties.

We are researchers who study the intersection of technology, law and public administration. In researching how police departments use AI and how digital technologies operate in a democratic society, we have seen how quickly the shift from probabilistic prediction to operational certainty happens in practice.

AI policing tools are used in dozens of U.S. cities, although no public registry tracks the full footprint. The tools ingest historical crime data and score neighborhoods on predicted risk so officers can be routed toward the resulting hot spots. The mechanism is straightforward, but its consequence is not. Once a system signals a possible threat, the question is no longer how certain the prediction is but what to do about it. A statistical output turns into a deployment decision, and the uncertainty that produced it gets lost on the way.

A matter of probabilities

When generative AI models such as ChatGPT or Claude respond to human requests, they are not searching a database and pulling out facts. They are predicting the most likely answer based on patterns in the data they have been trained on. When asked, “Who invented the light bulb?” the models do not go to a source or fact-check a finding. They generate a statistically probable answer, which is “Thomas Edison.” The reply might be right, but it might not capture the full story, such as Joseph Swan's parallel invention at the same time as Edison's. The danger arises when people believe that the model is retrieving truth rather than generating likelihoods.

This distinction matters. The most probable response is not the same as a factually verified answer, complete with context.
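To make the distinction concrete, here is a minimal sketch in Python. The probabilities are invented for illustration and do not come from any real model. The names `answer_probs` and `most_probable` are hypothetical. The sketch shows only that picking the most probable answer leaves some probability on alternatives, which the reply itself never reveals.

```python
# Hypothetical probabilities a model might assign to candidate answers.
# These numbers are made up for illustration; they are not measured values.
answer_probs = {
    "Thomas Edison": 0.82,
    "Joseph Swan": 0.13,
    "some other name": 0.05,
}


def most_probable(probs):
    """Return the highest-probability answer and its probability."""
    answer = max(probs, key=probs.get)
    return answer, probs[answer]


answer, p = most_probable(answer_probs)
print(f"Model reply: {answer} (p = {p:.2f})")
print(f"Probability left on other answers: {1 - p:.2f}")
```

The reply a user sees is only the first line. The second line, the leftover uncertainty, is the part that tends to disappear.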

This reality can be highly problematic for policing and law. For example, when law enforcement agencies use AI systems trained on geographical data to estimate where criminal activity is likely to occur, the algorithms analyze historical crime data and geographic patterns. These systems generate statistical risk scores or heat maps for locations based on prior incidents. But such predictions may have little bearing on who was involved in a new crime in the area, even if an algorithm generates information that sounds authoritative.
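A toy version of such a location score can be sketched in a few lines of Python. This is an assumption-laden illustration, not how any particular vendor's product works: the incident list is invented, the map is reduced to grid cells, and the hypothetical function `risk_scores` simply reports each cell's share of past incidents.

```python
from collections import Counter

# Hypothetical past incidents, each recorded only as a (row, col) grid cell.
past_incidents = [(0, 0), (0, 0), (0, 1), (2, 2), (0, 0), (1, 1), (2, 2)]


def risk_scores(incidents):
    """Score each grid cell by its share of all recorded incidents."""
    counts = Counter(incidents)
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


scores = risk_scores(past_incidents)
# Print cells from highest to lowest score, as a crude "heat map".
for cell, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(cell, f"{score:.2f}")
```

Every number in the output describes where incidents were recorded in the past. Nothing in it identifies a person or says anything about a new event.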

Some researchers have argued that predictive policing systems do not make racial minorities more likely to be arrested than they are under traditional policing practices. The broader concern, however, is not limited to measurable disparities in arrest outcomes. It is about how probabilistic predictions can become standardized operational decisions without further verification.

Artificial intelligence researchers caution against using these models in isolation for crime and legal proceedings or decision-making. Research at the University of Virginia's Digital Technology for Democracy Lab with police chiefs shows that some law enforcement groups follow strict policies that dictate when technology is used in tandem with, or in place of, human discretion, while others have no such policy.

What most users do not realize is that AI systems rarely produce binary answers: yes or no, a positive identification or a negative one. They generate probabilities. Some systems assign scores that express the system's confidence in a prediction. In those cases, engineers set a confidence threshold, a level of certainty that determines when the system should trigger an alert about a possible threat. You can think of this threshold as a setting on a control knob. A 95% confidence level, for example, indicates that the model considers its interpretation to be highly likely.

A low threshold catches more potential threats but increases false alarms. A high threshold reduces mistakes but risks missing real dangers. Either way, these algorithmic thresholds are often invisible to the public and are set quietly by vendors or agencies, even though they shape when police action begins.
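The effect of turning that knob can be sketched in Python. The scores and ground-truth labels below are invented for illustration, and the function name `count_outcomes` is hypothetical. The sketch assumes the simplest possible rule: any score at or above the threshold triggers an alert.

```python
# Hypothetical detector outputs: (confidence score, whether a real threat existed).
# The data is made up for illustration.
detections = [
    (0.97, True), (0.91, False), (0.88, True), (0.74, False),
    (0.66, True), (0.58, False), (0.41, False), (0.30, False),
]


def count_outcomes(detections, threshold):
    """Count outcomes when every score >= threshold triggers an alert."""
    caught = sum(1 for s, real in detections if s >= threshold and real)
    false_alarms = sum(1 for s, real in detections if s >= threshold and not real)
    missed = sum(1 for s, real in detections if s < threshold and real)
    return {"caught": caught, "false_alarms": false_alarms, "missed": missed}


low = count_outcomes(detections, 0.50)
high = count_outcomes(detections, 0.95)
print("threshold 0.50:", low)
print("threshold 0.95:", high)
```

With this toy data, the low setting catches all three real threats but also raises three false alarms. The high setting raises none but misses two of the three.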

Where to draw the line

In medicine, these kinds of trade-offs are explicit. Diagnostic tools are calibrated according to the relative harm of different errors. In infectious disease settings, for instance, systems that detect infections are often designed to accept more false positives to avoid missing contagious individuals. Medical professionals then follow up on the flagged cases. The algorithm-based decisions are also subject to professional standards, ethics reviews and regulatory oversight.

In policing, an AI system must balance false positives, where the system flags a threat that does not exist, and false negatives, where it fails to detect a real danger. The trade-off carries significant consequences. A lower threshold may generate more alerts and allow officers to intervene earlier, but it also increases the risk of mistaken identifications, which happened to Angela Lipps, or escalated encounters like the one Taki Allen experienced. A higher threshold may reduce wrongful interventions but could allow legitimate threats to go undetected.
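One way to see that the balance is a value judgment rather than a technical fact is to attach a cost to each kind of error and ask which threshold is cheapest. The Python sketch below does this under loud assumptions: the events, the candidate thresholds and the cost weights are all invented, and the hypothetical helpers `total_cost` and `cheapest_threshold` are not drawn from any real policing system.

```python
# Hypothetical scored events: (confidence score, whether a real threat existed).
events = [
    (0.97, True), (0.91, False), (0.88, True), (0.74, False),
    (0.66, True), (0.58, False), (0.41, False), (0.30, False),
]


def total_cost(events, threshold, cost_fp, cost_fn):
    """Weighted cost of false positives and false negatives at a threshold."""
    fp = sum(1 for s, real in events if s >= threshold and not real)
    fn = sum(1 for s, real in events if s < threshold and real)
    return fp * cost_fp + fn * cost_fn


def cheapest_threshold(events, candidates, cost_fp, cost_fn):
    """Pick the candidate threshold with the lowest weighted cost."""
    return min(candidates, key=lambda t: total_cost(events, t, cost_fp, cost_fn))


candidates = [0.50, 0.70, 0.90, 0.95]
# Assumed weights: first treat a miss as 10x worse, then a false alarm as 10x worse.
miss_averse = cheapest_threshold(events, candidates, cost_fp=1, cost_fn=10)
alarm_averse = cheapest_threshold(events, candidates, cost_fp=10, cost_fn=1)
print("miss-averse choice:", miss_averse)
print("false-alarm-averse choice:", alarm_averse)
```

The data never changes between the two runs. Only the weights do, and they alone move the "best" threshold from the lowest candidate to the highest.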

Some law enforcement agencies argue that acting on imperfect signals is preferable to missing serious risks. But lowering the bar for algorithmic alerts based on probabilistic estimates effectively expands the number of people subjected to police attention. It is important to realize that these thresholds are not neutral features of the technology; they are choices embedded by the creators in the model's code. Decisions about where to draw the line determine when an algorithmic suspicion becomes a real-world police action, even though the public rarely sees or debates how those thresholds are set.

Limits of optimization

Developers often use several methods to determine where to set a confidence threshold. Techniques such as “receiver operating characteristic curve analysis” examine how changing the threshold for an alert alters the balance between correctly identifying real events and mistakenly flagging harmless ones. Precision–recall analysis examines a similar trade-off, asking how accurate the system's alerts are relative to the number of incidents it successfully detects.
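Both analyses rest on the same few counts, computed again at each candidate threshold. The following Python sketch uses invented labeled data and a hypothetical function, `curve_points`, to compute the quantities behind the two curves: the true positive rate and false positive rate (the axes of a receiver operating characteristic curve) and precision and recall. Recall is the same number as the true positive rate.

```python
# Hypothetical labeled scores: (confidence score, whether a real threat existed).
labeled = [
    (0.97, True), (0.91, False), (0.88, True), (0.74, False),
    (0.66, True), (0.58, False), (0.41, False), (0.30, False),
]


def curve_points(labeled, thresholds):
    """Compute ROC and precision-recall quantities at each threshold."""
    positives = sum(1 for _, real in labeled if real)
    negatives = len(labeled) - positives
    points = []
    for t in thresholds:
        tp = sum(1 for s, real in labeled if s >= t and real)
        fp = sum(1 for s, real in labeled if s >= t and not real)
        points.append({
            "threshold": t,
            "tpr": tp / positives,   # true positive rate, also called recall
            "fpr": fp / negatives,   # false positive rate
            # Precision is undefined when no alerts fire at this threshold.
            "precision": tp / (tp + fp) if tp + fp else None,
        })
    return points


points = curve_points(labeled, [0.50, 0.70, 0.95])
for point in points:
    print(point)
```

In practice developers usually rely on an established statistics library rather than hand-rolled counts. The arithmetic is spelled out here only to show that each point on either curve is a different answer to the same question: how many real events are caught, and at what price in false alerts.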

These approaches could help calibrate systems more responsibly by testing how often an algorithm wrongly flags people or locations. Fine-tuning can improve system performance. But the techniques cannot resolve the underlying question of how much algorithmic uncertainty society is willing to tolerate.

In law, standards of proof determine how convincing evidence must be before a judge or jury can rule in favor of a plaintiff or defendant. Courts apply different standards depending on the stakes, such as probable cause, preponderance of the evidence and beyond a reasonable doubt. These standards reflect a societal judgment about how much uncertainty is acceptable before exercising legal authority. A court does not accept a guess or a prediction; it follows a process to weigh evidence. Unlike humans, an AI model does not usually say, “I'm not sure.” A model typically expresses confidence in its reply, even when the answer is incorrect.

Stakes are rising as AI enters the courtroom, law enforcement, the classroom, the doctor's office and the public sector. It is important for people to understand that AI does not know things the way many assume it does. It does not distinguish between “maybe” and “definitely.” That is up to us. We believe that technologists should design systems that admit uncertainty and should educate users about how to interpret AI outputs responsibly.

Institution: University of Virginia