Abstract
Objectives
We examine public attitudes toward false positives and false negatives in criminal justice risk assessment, and how people's preferences differ across offense types and stages of the criminal justice process.
Methods
We use data from a factorial survey experiment conducted with a sample of 575 Americans. Respondents were randomly assigned to vignette conditions that varied the stage of the criminal justice process and the severity of the offense, and were asked to choose a cost ratio of false positives to false negatives.
Results
Although respondents prefer cost ratios that tolerate more false positives than false negatives, the degree to which they accept false positives is lower than the cost ratios built into existing risk assessment instruments. Offense severity affects respondents' acceptance of false positives, and numeracy influences their choice of cost ratio.
Conclusions
To our knowledge, this is the first study to investigate public opinion on the cost ratio in risk assessment. We suggest that eliciting public opinion can serve as an alternative way to identify an appropriate cost ratio.
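To make the trade-off concrete, the sketch below is our own illustration (not part of the study, and the cost values are hypothetical) of how a chosen cost ratio maps onto a probability threshold in standard cost-sensitive classification: the costlier a false negative is relative to a false positive, the lower the threshold for flagging someone as high risk, and the more false positives the instrument tolerates.

```python
# Illustrative sketch only: how a cost ratio of false positives to
# false negatives translates into a decision threshold. The specific
# costs used below are hypothetical, not taken from the study.

def decision_threshold(cost_fp: float, cost_fn: float) -> float:
    """Probability threshold above which 'high risk' minimizes expected cost.

    Flagging a non-reoffender incurs cost_fp; releasing a reoffender
    incurs cost_fn. With predicted reoffense probability p, flagging is
    cheaper in expectation when p * cost_fn > (1 - p) * cost_fp,
    i.e. when p > cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)

# A symmetric 1:1 ratio gives the familiar 0.5 threshold.
print(decision_threshold(1, 1))  # 0.5

# If a false negative is treated as five times costlier than a false
# positive, the threshold falls to ~0.167, so more people are flagged
# as high risk and more false positives are accepted.
print(decision_threshold(1, 5))  # ~0.167
```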
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Cite this article
Kang, B., Wu, S. False positives vs. false negatives: public opinion on the cost ratio in criminal justice risk assessment. J Exp Criminol 19, 919–941 (2023). https://doi.org/10.1007/s11292-022-09512-2