Probabilistic genotyping software produces thousand-fold different results on the same DNA evidence, and no one can tell which answer is right

Two commercially dominant probabilistic genotyping programs, STRmix and TrueAllele, are used by crime labs across the United States to interpret complex DNA mixtures found at crime scenes. When both programs were applied to the same DNA evidence in a federal criminal case, STRmix produced a likelihood ratio of 24 favoring exclusion, while TrueAllele produced a likelihood ratio ranging from 1.2 million to 16.7 million favoring inclusion. That is not a rounding error. One program said the suspect's DNA was probably not in the sample; the other said it almost certainly was. In a broader validation study comparing STRmix and EuroForMix on over 400 mixtures from NIST's PROVEDIt dataset, the two programs produced likelihood ratios differing by more than a thousand-fold in over 14% of cases.

This matters because these likelihood ratios are presented to juries as if they were scientific facts. A prosecutor tells a jury that the chance this DNA belongs to someone other than the defendant is one in 16.7 million, and the jury convicts. But if the lab had used different software, the number might have been 24 to 1 against the defendant being the source at all. The defendant's freedom hinges not on the actual evidence but on which black-box algorithm the lab happened to license. Defense attorneys rarely have the resources or technical knowledge to challenge these numbers, and most judges lack the scientific background to evaluate competing claims about Markov chain Monte Carlo sampling parameters.

This problem persists because both STRmix and TrueAllele treat their source code as a proprietary trade secret, making independent scientific review nearly impossible. The forensic community has no consensus standard for which modeling assumptions are correct. Each vendor has published validation studies on its own software, but no independent body requires head-to-head comparison on the same evidence. Labs choose software based on cost and vendor relationships, not on demonstrated superiority. The result is that the criminal justice system has outsourced a life-or-death determination to competing commercial products that give contradictory answers, and there is no mechanism to resolve the contradiction.
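
To put the size of the disagreement in the federal case on a single scale, the two reported likelihood ratios can be compared in orders of magnitude. The snippet below is only an illustrative sketch: the function name `log10_fold_difference` is hypothetical, and it assumes that "24 favoring exclusion" corresponds to 1/24 on the usual scale where values above 1 favor inclusion. The threshold of 3.0 mirrors the thousand-fold criterion quoted from the validation study.

```python
import math


def log10_fold_difference(lr_a: float, lr_b: float) -> float:
    """Absolute gap between two likelihood ratios, in orders of magnitude."""
    if lr_a <= 0 or lr_b <= 0:
        raise ValueError("likelihood ratios must be positive")
    return abs(math.log10(lr_a) - math.log10(lr_b))


# Figures quoted in the post. Assumption: an LR of 24 favoring exclusion
# is written as 1/24 on the scale where values above 1 favor inclusion.
strmix_lr = 1 / 24
trueallele_low, trueallele_high = 1.2e6, 16.7e6

gap_low = log10_fold_difference(strmix_lr, trueallele_low)
gap_high = log10_fold_difference(strmix_lr, trueallele_high)

# A thousand-fold difference is 3 orders of magnitude.
THOUSAND_FOLD = 3.0
exceeds_threshold = gap_low > THOUSAND_FOLD

print(f"gap: {gap_low:.1f} to {gap_high:.1f} orders of magnitude")
# prints "gap: 7.5 to 8.6 orders of magnitude"
```

Under these assumptions the two programs disagree by roughly seven to nine orders of magnitude, far beyond the thousand-fold threshold.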

Evidence

Thompson (2023), 'Uncertainty in probabilistic genotyping of low template DNA: A case study comparing STRmix and TrueAllele', Journal of Forensic Sciences: https://onlinelibrary.wiley.com/doi/full/10.1111/1556-4029.15225

Harvard JOLT analysis of black-box algorithms in DNA interpretation: https://jolt.law.harvard.edu/assets/articlePDFs/v31/31HarvJLTech275.pdf

ProPublica investigation, 'Where Traditional DNA Testing Fails, Algorithms Take Over': https://www.propublica.org/article/where-traditional-dna-testing-fails-algorithms-take-over

Criminal Legal News (2025), 'Probabilistic Genotyping on Trial': https://www.criminallegalnews.org/news/2025/aug/1/probabilistic-genotyping-trial-can-we-trust-secret-algorithms-deciding-guilt/
