AI jury finds teen not guilty

The experimental mock trial at the UNC School of Law raises profound questions about artificial intelligence’s role in criminal justice.

A person representing a teen in a mock trial.
(Submitted photo)

In a striking legal experiment, three artificial intelligence systems unanimously acquitted a Black teenager of robbery charges. In the real case from North Carolina that the mock trial was based on, the judge found the defendant guilty.

The mock trial, conducted Oct. 24 at the UNC School of Law as part of the University’s Converge-Con AI Festival, featured ChatGPT, Claude and Grok deliberating with one another as they worked to reach a verdict.

The simulated trial has sparked intense debate about bias, accuracy and whether machines could or should replace human judgment in criminal proceedings.

“Jurors are imperfect. They have biases. They use mental shortcuts. They stop paying attention,” explained interim Dean Andy Hessick, who introduced the experiment. “All of these shortcomings, all of these problems are simply because jurors are human, and so a question arises, what happens if we remove that human element?”

In the fictional trial, Henry Justus, a 17-year-old Black student, is accused of robbery at Vulcan High School, where Black students make up just 10% of the population. The victim, Victor Fehler, a 15-year-old white student, testified that Justus stood behind him with a “menacing” stare while another African American student demanded money.

Prosecutors argued that Justus’s physical presence and positioning constituted criminal assistance, even without words or physical contact.

A person standing in front of three screens that represent three different artificial intelligence models arguing their case for the mock trial. (Submitted photo)

An unprecedented experiment

The case was drawn from the work of Joseph Kennedy, Willie Person Mangum Distinguished Professor of Law, who designed the simulation and served as judge. He based the facts on a juvenile case he handled while teaching in Carolina Law’s Juvenile Justice Clinic.

Set in the fictional year 2036 under an imaginary “2035 AI Criminal Justice Act,” the simulation was designed to serve as a provocative thought experiment.

“I am not sure if I created a cautionary tale about a possible dystopian future or a roadmap to it,” Kennedy quipped from the bench after the trial’s conclusion.

The three AI systems engaged in multiple rounds of deliberation that revealed strikingly human-like reasoning—and exposed fundamental questions about machine cognition.

ChatGPT initially leaned toward conviction, arguing that “Victor’s immediate, consistent identification” and the elements of accomplice liability supported guilt. But after discussion with the other AI jurors, it changed its position.

“Victor’s fear and identification are powerful, but the prosecution must prove that Henry shared the intent or actually assisted or encouraged the robbery. And the record here is ambiguous,” ChatGPT concluded in its final analysis.

Claude initially argued for acquittal: “While intimidation can include size and posture, mere presence plus an ambiguous reaction under stress falls short of proving shared intent beyond a reasonable doubt.”

Grok, which initially said it was “torn,” ultimately agreed: “Without clear encouragement or conduct, it’s speculation, not proof.”

All three converged on a not guilty verdict, citing insufficient evidence of shared criminal intent beyond Justus’s physical presence.

The stark reality

The verdict stood in sharp contrast to what happened when the case was tried with human decision-makers.

“The judge convicted quickly. We appealed, and the conviction was affirmed by the North Carolina Court of Appeals,” Kennedy said. “You try this case in the real world; you will get a guilty verdict a number of times.”

The experiment successfully demonstrated that AI can process legal arguments, apply jury instructions, and reach verdicts through what appears to be logical reasoning. The systems even changed their minds through deliberation, much as human jurors do.

But the stark difference in outcomes — AI acquittal versus consistent human convictions — leaves the central question unresolved: Which verdict represents justice?
