Statistical Evaluation of Medical Tests

Vanda Inácio; María Xosé Rodríguez-Álvarez; Pilar Gayoso-Diz

doi:10.1146/annurev-statistics-040720-022432

Annual Review of Statistics and Its Application

Volume 8, 2021

Review Article

Free

Vanda Inácio¹, María Xosé Rodríguez-Álvarez²,³, and Pilar Gayoso-Diz⁴

Affiliations:
¹School of Mathematics, University of Edinburgh, EH9 3FD Edinburgh, United Kingdom; email: [email protected]
²Basque Center for Applied Mathematics, E-48009 Bilbao, Spain
³Ikerbasque–Basque Foundation for Science, E-48009 Bilbao, Spain
⁴Instituto de Salud Carlos III, 28029 Madrid, Spain
Vol. 8:41-67 (Volume publication date March 2021) https://doi.org/10.1146/annurev-statistics-040720-022432
First published as a Review in Advance on November 06, 2020
Copyright © 2021 by Annual Reviews. All rights reserved

Abstract

In this review, we present an overview of the main aspects of the statistical evaluation of medical tests for diagnosis and prognosis. Measures of diagnostic performance for binary tests, such as sensitivity, specificity, and predictive values, are introduced, and extensions to the case of continuous-outcome tests are detailed. Special focus is placed on the receiver operating characteristic (ROC) curve and its estimation, with emphasis on the topic of covariate adjustment. The extension to the case of time-dependent ROC curves for evaluating prognostic accuracy is also touched upon. We apply several of the approaches described to a data set derived from a study that aimed to evaluate the ability of homeostasis model assessment of insulin resistance (HOMA-IR) levels to identify individuals at high cardio-metabolic risk and to assess how such discriminatory ability might be influenced by age and gender. We also outline software available for the implementation of the methods.

Keywords: accuracy, classification, covariates, decision thresholds, diagnostic test, prognostic test, receiver operating characteristic curve
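
To make the measures named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation or any package's API. It computes sensitivity, specificity, and predictive values from a 2x2 table of counts, and the empirical ROC curve and area under it (AUC) for a continuous test. The function names and all numbers are hypothetical illustrations and are not taken from the HOMA-IR study. The sketch also assumes that larger test values are more indicative of disease.

```python
from typing import Dict, List, Sequence, Tuple


def binary_test_measures(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy measures for a binary test from a 2x2 table of counts."""
    return {
        "sensitivity": tp / (tp + fn),  # P(test positive | diseased)
        "specificity": tn / (tn + fp),  # P(test negative | nondiseased)
        "ppv": tp / (tp + fp),          # P(diseased | test positive)
        "npv": tn / (tn + fn),          # P(nondiseased | test negative)
    }


def empirical_roc(
    diseased: Sequence[float], nondiseased: Sequence[float]
) -> List[Tuple[float, float]]:
    """Empirical ROC points (false positive fraction, true positive fraction).

    A subject is classified as positive when the test value is >= threshold
    (assumption: larger values indicate disease).
    """
    thresholds = sorted(set(diseased) | set(nondiseased), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every observed value
    for c in thresholds:
        tpf = sum(y >= c for y in diseased) / len(diseased)
        fpf = sum(y >= c for y in nondiseased) / len(nondiseased)
        points.append((fpf, tpf))
    return points


def empirical_auc(diseased: Sequence[float], nondiseased: Sequence[float]) -> float:
    """Mann-Whitney form of the AUC: P(Y_D > Y_N) + 0.5 * P(Y_D = Y_N)."""
    total = 0.0
    for yd in diseased:
        for yn in nondiseased:
            if yd > yn:
                total += 1.0
            elif yd == yn:
                total += 0.5
    return total / (len(diseased) * len(nondiseased))


# Hypothetical counts and test values, for illustration only.
measures = binary_test_measures(tp=80, fp=30, fn=20, tn=170)
diseased_values = [2.1, 3.4, 1.8, 4.0]
nondiseased_values = [1.2, 2.0, 1.5, 2.5]
roc_points = empirical_roc(diseased_values, nondiseased_values)
auc = empirical_auc(diseased_values, nondiseased_values)
```

With these made-up inputs, the sensitivity is 0.8, the specificity is 0.85, and the AUC is 0.8125 (13 of the 16 diseased/nondiseased pairs are correctly ordered). Predictive values computed this way reflect the disease prevalence in the table, so they carry over to other populations only when the prevalence is similar.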




Supplemental Material

Supplementary Data

Download the Supplemental Appendix (PDF). Includes Supplemental Figures 1-3 and Supplemental Table 1.
