Volume 66, No. 3

Detecting Systematic Bias in Criminal Racial Assignment

Daniel Lee Van Pelt

Published: 2026/03/01

Abstract

We analysed racial classification in U.S. Department of Corrections databases across 14 states comprising 1.5 million criminal records. An accurate linear model trained on biased data learns the underlying signal rather than the bias itself; we interpret systematic deviations between model predictions and official classifications as evidence of mislabelling by authorities rather than model error. Using facial recognition algorithms on mugshots and name-based demographic data, we achieved 92.76% agreement with assigned race labels. We identified substantial misclassification: 29% of predicted Hispanics were officially assigned as White. This pattern persisted among high-confidence model predictions (median confidence 91%). Correcting for misclassification increased Hispanic criminal count rates by 31%, decreased White rates by 6%, and decreased Black rates by 1%. Simulation studies confirmed that the pattern resembled random rather than deliberate bias. State-level analysis (n = 14) revealed no statistically significant association with political ideology (r = .21, 95% CI: −0.36 to 0.67, p = .473). The proportion of predicted Hispanics assigned as White and the proportion of predicted Whites assigned as Hispanic both correlated with Native American ancestry among Latinos (r = −.80, 95% CI: −0.95 to −0.38, p = .003, n = 11, and r = .74, 95% CI: 0.26 to 0.93, p = .009, n = 11, respectively).
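
The abstract does not say how the 95% confidence intervals for the correlations were computed. As an illustration only, the sketch below assumes the standard Fisher z-transformation; the function name `fisher_ci` and the choice of method are assumptions, not taken from the paper. Under that assumption, the reported intervals for r = .21 (n = 14) and r = −.80 (n = 11) are recovered to two decimal places.

```python
import math


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a Pearson correlation via Fisher's z.

    Assumption: the interval is built on the atanh scale with
    standard error 1 / sqrt(n - 3), then mapped back with tanh.
    """
    if n <= 3:
        raise ValueError("n must exceed 3 for the Fisher z interval")
    z = math.atanh(r)                # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)      # standard error on the z scale
    lo = math.tanh(z - z_crit * se)  # back-transform lower bound
    hi = math.tanh(z + z_crit * se)  # back-transform upper bound
    return lo, hi


if __name__ == "__main__":
    # Values of r and n as reported in the abstract.
    for r, n in [(0.21, 14), (-0.80, 11)]:
        lo, hi = fisher_ci(r, n)
        print(f"r = {r:+.2f}, n = {n}: 95% CI {lo:.2f} to {hi:.2f}")
```

With only 11 to 14 states, the standard error on the z scale is large, which is why even the strong ancestry correlations carry wide intervals.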
