ROC and AUC with a Binary Predictor: a Potentially Misleading Metric

  • Software Abstract
  • Journal of Classification

Abstract

In the analysis of binary outcomes, the receiver operating characteristic (ROC) curve is widely used to show the performance of a model or algorithm. The ROC curve describes performance over a series of thresholds and can be summarized by a single number, the area under the curve (AUC). When a predictor is categorical, the ROC curve has one fewer potential thresholds than the number of categories; when the predictor is binary, there is only one threshold. Because the AUC may be used in decision-making to determine the best model, it is important to discuss how it agrees with the intuition from the ROC curve. We discuss how the interpolation of the curve between thresholds can substantially change the AUC when predictors are binary. Overall, we show that the AUC most commonly estimated by software corresponds to a linear interpolation of the ROC curve with binary predictors, which we believe can lead to misleading results. We compare R, Python, Stata, and SAS software implementations. We recommend reporting the interpolation used and discuss the merit of the step function interpolator, also referred to as the “pessimistic” approach by Fawcett (2006).
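The gap between the two interpolations can be illustrated with a minimal Python sketch. This is not the article's code: the function names (`roc_point`, `auc_linear`, `auc_step`) and the toy data are assumptions made for illustration. With a binary predictor the ROC curve has a single interior point (FPR, TPR). Joining (0, 0), (FPR, TPR), and (1, 1) with straight lines gives an area of (sensitivity + specificity) / 2, while a step function that stays at the lower value until the next point is reached gives sensitivity × specificity.

```python
def roc_point(y_true, y_pred):
    """Return (fpr, tpr) for a 0/1 predictor scored against 0/1 outcomes."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, x in pairs if y == 1 and x == 1)
    fp = sum(1 for y, x in pairs if y == 0 and x == 1)
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    return fp / n_neg, tp / n_pos


def auc_linear(fpr, tpr):
    """Area under straight lines through (0,0), (fpr,tpr), (1,1).

    Algebraically this equals (sensitivity + specificity) / 2.
    """
    left = 0.5 * fpr * tpr                    # triangle from 0 to fpr
    right = 0.5 * (1.0 - fpr) * (tpr + 1.0)   # trapezoid from fpr to 1
    return left + right


def auc_step(fpr, tpr):
    """Area under the step ("pessimistic") interpolation.

    The curve stays at 0 until fpr, then at tpr until 1, so the area is
    tpr * (1 - fpr), i.e. sensitivity * specificity.
    """
    return tpr * (1.0 - fpr)


# Toy data (made up): 4 cases and 4 controls, binary predictor.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

fpr, tpr = roc_point(y_true, y_pred)
print("sensitivity:", tpr, "specificity:", 1.0 - fpr)
print("AUC, linear interpolation:", auc_linear(fpr, tpr))
print("AUC, step interpolation:  ", auc_step(fpr, tpr))
```

In this toy case sensitivity and specificity are both 0.75, so the two areas differ noticeably even though the underlying ROC point is identical, which is why stating the interpolation alongside the AUC matters.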

References

  • Allaire, J.J., Ushey, K., Tang, Y. (2018). Reticulate: interface to ‘Python’. https://github.com/rstudio/reticulate.

  • Bamber, D. (1975). The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. Journal of Mathematical Psychology, 12(4), 387–415.

  • Blumberg, D.M., De Moraes, C.G., Liebmann, J.M., Garg, R., Chen, C., Theventhiran, A., Hood, D.C. (2016). Technology and the glaucoma suspect. Investigative Ophthalmology & Visual Science, 57(9), OCT80–OCT85.

  • Budweg, J., Sprenger, T., De Vere-Tyndall, A., Hagenkord, A., Stippich, C., Berger, C.T. (2016). Factors associated with significant MRI findings in medical walk-in patients with acute headache. Swiss Medical Weekly, 146, w14349.

  • DeLong, E.R, DeLong, D.M, Clarke-Pearson, D.L. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics, 837–45.

  • Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861–74.

  • Glaveckaite, S., Valeviciene, N., Palionis, D., Skorniakov, V., Celutkiene, J., Tamosiunas, A., Uzdavinys, G., Laucevicius, A. (2011). Value of scar imaging and inotropic reserve combination for the prediction of segmental and global left ventricular functional recovery after revascularisation. Journal of Cardiovascular Magnetic Resonance, 13(1), 35.

  • Hanley, J.A, & McNeil, B.J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology, 143(1), 29–36.

  • Hsu, Y.-C., & Lieli, R. (2014). Inference for ROC curves based on estimated predictive indices: a note on testing AUC = 0.5. Unpublished Manuscript.

  • Hunter, J.D. (2007). Matplotlib: a 2D graphics environment. Computing in Science & Engineering, 9(3), 90–95. https://doi.org/10.1109/MCSE.2007.55.

  • Kushnir, V.A, Darmon, S.K, Barad, D.H, Gleicher, N. (2018). Degree of mosaicism in trophectoderm does not predict pregnancy potential: a corrected analysis of pregnancy outcomes following transfer of mosaic embryos. Reproductive Biology and Endocrinology, 16(1), 6.

  • Litvin, T.V., Bresnick, G.H., Cuadros, J.A., Selvin, S., Kanai, K., Ozawa, G.Y. (2017). A revised approach for the detection of sight-threatening diabetic macular edema. JAMA Ophthalmology, 135(1), 62–68. https://doi.org/10.1001/jamaophthalmol.2016.4772.

  • Maverakis, E., Ma, C., Shinkai, K., et al. (2018). Diagnostic criteria of ulcerative pyoderma gangrenosum: a Delphi consensus of international experts. JAMA Dermatology, 154(4), 461–66. https://doi.org/10.1001/jamadermatol.2017.5980.

  • Mwipatayi, B.P, Sharma, S., Daneshmand, A., Thomas, S.D, Vijayan, V., Altaf, N., Garbowski, M., et al. (2016). Durability of the balloon-expandable covered versus bare-metal stents in the covered versus balloon expandable stent trial (COBEST) for the treatment of aortoiliac occlusive disease. Journal of Vascular Surgery, 64(1), 83–94.

  • Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., et al. (2011). Scikit-learn: machine learning in python. Journal of Machine Learning Research, 12, 2825–30.

  • Pepe, M., Longton, G., Janes, H. (2009). Estimation and comparison of receiver operating characteristic curves. The Stata Journal, 9(1), 1.

  • Peter, E. (2016). Fbroc: fast algorithms to bootstrap receiver operating characteristics curves. https://CRAN.R-project.org/package=fbroc.

  • R Core Team. (2018). R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

  • Robin, X., Turck, N., Hainard, A., Tiberti, N., Lisacek, F., Sanchez, J.-C., Müller, M. (2011). pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics, 12, 77.

  • Saito, T., & Rehmsmeier, M. (2015). The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PloS One, 10(3), e0118432.

  • SAS Institute Inc. (2017). SAS/STAT, Version 9.4 [Computer program]. Cary, NC: SAS Institute.

  • Shterev, I.D, Dunson, D.B, Chan, C., Sempowski, G.D. (2018). Bayesian multi-plate high-throughput screening of compounds. Scientific Reports, 8(1), 9551.

  • Sing, T., Sander, O., Beerenwinkel, N., Lengauer, T. (2005). ROCR: visualizing classifier performance in R. Bioinformatics, 21(20), 3940–3941. http://rocr.bioinf.mpi-sb.mpg.de.

  • Snarr, B.S, Liu, M.Y, Zuckerberg, J.C, Falkensammer, C.B, Nadaraj, S., Burstein, D., Ho, D., et al. (2017). The parasternal short-axis view improves diagnostic accuracy for inferior sinus venosus type of atrial septal defects by transthoracic echocardiography. Journal of the American Society of Echocardiography, 30(3), 209–15.

  • StataCorp. (2013). Stata statistical software: Release 13. College Station, TX: StataCorp LP.

  • Tuszynski, J. (2018). caTools: Tools: Moving Window Statistics, GIF, Base64, ROC AUC, Etc. https://CRAN.R-project.org/package=caTools.

  • Veltri, D., Kamath, U., Shehu, A. (2018). Deep learning improves antimicrobial peptide recognition. Bioinformatics, 1, 8.

  • Xiong, X., Li, Q., Yang, W.-S., Wei, X., Hu, X., Wang, X.-C., Zhu, D., Li, R., Cao, D., Xie, P. (2018). Comparison of swirl sign and black hole sign in predicting early hematoma growth in patients with spontaneous intracerebral hemorrhage. Medical Science Monitor: International Medical Journal of Experimental and Clinical Research, 24, 567.

Funding

This analysis was supported by NIH Grants R01NS060910 and U01NS080824.

Author information

Corresponding author

Correspondence to John Muschelli III.

Cite this article

Muschelli, J. ROC and AUC with a Binary Predictor: a Potentially Misleading Metric. J Classif 37, 696–708 (2020). https://doi.org/10.1007/s00357-019-09345-1

  • DOI: https://doi.org/10.1007/s00357-019-09345-1
