Published by De Gruyter August 7, 2020

Estimating the area under a receiver operating characteristic curve using partially ordered sets

  • Ehsan Zamanzade and Xinlei Wang

Abstract

Ranked set sampling (RSS), a cost-effective sampling technique, requires the ranker to give a complete ranking of the units in each set. Frey (2012) proposed a modification of RSS based on partially ordered sets, referred to as RSS-t in this paper, that allows the ranker to declare as many ties as he or she wishes. We consider the problem of estimating the area under a receiver operating characteristic (ROC) curve using RSS-t samples. The area under the ROC curve (AUC) is commonly used as a measure of the effectiveness of diagnostic markers. We develop six nonparametric estimators of the AUC, with and without utilizing tie information, based on different approaches. We then compare the estimators using a Monte Carlo simulation and an empirical study with real data from the National Health and Nutrition Examination Survey. The results show that utilizing tie information increases the efficiency of estimating the AUC. Suggestions on which estimator to choose in which setting are also provided for practitioners.
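
For readers unfamiliar with the quantity being estimated, the following is a minimal sketch of the classical nonparametric AUC estimate under simple random sampling, where the AUC is read as P(X < Y) + 0.5 P(X = Y) for a marker value X from a healthy subject and Y from a diseased subject. It is not one of the six RSS-t estimators developed in the paper; the function name `auc_mann_whitney` and the sample values are hypothetical and chosen only for illustration.

```python
from itertools import product


def auc_mann_whitney(healthy, diseased):
    """Empirical AUC: share of (healthy, diseased) pairs in which the
    diseased marker value is larger, with tied pairs counted as one half.

    Assumes both samples are simple random samples (not RSS-t samples).
    """
    if not healthy or not diseased:
        raise ValueError("both samples must be non-empty")
    score = 0.0
    for x, y in product(healthy, diseased):
        if x < y:
            score += 1.0   # pair ordered correctly
        elif x == y:
            score += 0.5   # tied pair contributes half
    return score / (len(healthy) * len(diseased))


# Hypothetical marker values, for illustration only.
healthy = [1.2, 2.0, 2.5, 3.1]
diseased = [2.5, 3.0, 3.8, 4.4]
print(auc_mann_whitney(healthy, diseased))  # 13.5 of 16 pairs -> 0.84375
```

A value near 1 indicates a marker that separates the two groups well, while a value near 0.5 indicates a marker no better than chance.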


Corresponding authors: Ehsan Zamanzade, Department of Statistics, Faculty of Mathematics and Statistics, University of Isfahan, Isfahan, 81746-73441, Iran; and Xinlei Wang, Department of Statistical Science, Southern Methodist University, 3225 Daniel Avenue, Dallas, TX 75275-0332, USA
The authors wish it to be known that, in their opinion, they both should be equally regarded as the corresponding author.

Acknowledgments

The authors are grateful to two anonymous referees for helpful comments that have resulted in an improved paper.

  1. Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: Ehsan Zamanzade's research was supported in part by Iran National Science Foundation (INSF).

  3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.

References

1. McIntyre, GA. A method for unbiased selective sampling using ranked set sampling. Aust J Agric Res 1952;3:385–90. https://doi.org/10.1071/ar9520385.

2. Stokes, SL, Sager, TW. Characterization of a ranked-set sample with application to estimating distribution functions. J Am Stat Assoc 1988;83:374–81. https://doi.org/10.1080/01621459.1988.10478607.

3. Kvam, PH, Samaniego, FJ. Nonparametric maximum likelihood estimation based on ranked set samples. J Am Stat Assoc 1994;89:526–37. https://doi.org/10.1080/01621459.1994.10476777.

4. Huang, J. Asymptotic properties of NPMLE of a distribution function based on ranked set samples. Ann Stat 1997;25:1036–49. https://doi.org/10.1214/aos/1069362737.

5. Duembgen, L, Zamanzade, E. Inference on a distribution function from ranked set samples. Ann Inst Stat Math 2020;72:157–85. https://doi.org/10.1007/s10463-018-0680-y.

6. Chen, H, Stasny, EA, Wolfe, DA. Ranked set sampling for efficient estimation of a population proportion. Stat Med 2005;24:3319–29. https://doi.org/10.1002/sim.2158.

7. Zamanzade, E, Mahdizadeh, M. A more efficient proportion estimator in ranked set sampling. Stat Prob Lett 2017;129:28–33. https://doi.org/10.1016/j.spl.2017.05.001.

8. Takahasi, K, Wakimoto, K. On unbiased estimates of the population mean based on the sample stratified by means of ordering. Ann Inst Stat Math 1968;20:1–31. https://doi.org/10.1007/bf02911622.

9. Frey, J, Feeman, TG. Efficiency comparisons for partially rank-ordered set sampling. Stat Papers 2016;58:1149–66. https://doi.org/10.1007/s00362-016-0742-2.

10. Frey, J, Zhang, Y. Testing perfect rankings in ranked-set sampling with binary data. Can J Stat 2017;45:326–39. https://doi.org/10.1002/cjs.11326.

11. Wang, X, Lim, J, Stokes, L. Using ranked set sampling with cluster randomized designs for improved inference on treatment effects. J Am Stat Assoc 2016;111:1576–90. https://doi.org/10.1080/01621459.2015.1093946.

12. Halls, LK, Dell, TR. Trial of ranked-set sampling for forage yields. Forest Sci 1966;12:22–6. https://doi.org/10.1093/forestscience/12.1.22.

13. Howard, RW, Jones, SC, Mauldin, JK, Beal, RH. Abundance, distribution, and colony size estimates for Reticulitermes spp. (Isoptera: Rhinotermitidae) in Southern Mississippi. Environ Entomol 1982;11:1290–3. https://doi.org/10.1093/ee/11.6.1290.

14. Haq, A, Brown, J, Moltchanova, E, Al-Omari, AI. Partial ranked set sampling design. Environmetrics 2013;24:201–7. https://doi.org/10.1002/env.2203.

15. Hatefi, A, Jafari Jozani, M, Ozturk, O. Mixture model analysis of partially rank-ordered set samples: age groups of fish from length-frequency data. Scand J Stat 2015;42:848–71. https://doi.org/10.1111/sjos.12140.

16. Zamanzade, E, Mahdizadeh, M. Using ranked set sampling with extreme ranks in estimating the population proportion. Stat Methods Med Res 2020;29:165–77. https://doi.org/10.1177/0962280218823793.

17. Wang, X, Ahn, S, Lim, J. Unbalanced ranked set sampling in cluster randomized studies. J Stat Plann Inference 2017;187:1–16. https://doi.org/10.1016/j.jspi.2017.02.005.

18. Frey, J. Nonparametric mean estimation using partially ordered sets. Environ Ecol Stat 2012;19:309–26. https://doi.org/10.1007/s10651-012-0188-1.

19. Zamanzade, E, Wang, X. Proportion estimation in ranked set sampling in the presence of tie information. Comput Stat 2018;33:1349–66. https://doi.org/10.1007/s00180-018-0807-x.

20. Zamanzade, E, Wang, X. Improved nonparametric estimation using partially ordered sets. In: Chandra, G, Nautiyal, R, Chandra, H, editors. Statistical methods and applications in forestry and environmental sciences. Forum for interdisciplinary mathematics. Singapore: Springer; 2020. https://doi.org/10.1007/978-981-15-1476-0_5.

21. Bamber, DC. The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. J Math Psychol 1975;12:387–415. https://doi.org/10.1016/0022-2496(75)90001-2.

22. Kotz, S, Lumelskii, Y, Pensky, M. The stress-strength model and its generalizations. Theory and applications. Singapore: World Scientific; 2003. https://doi.org/10.1142/9789812564511.

23. Faraggi, D, Reiser, B. Estimation of the area under the ROC curve. Stat Med 2002;21:3093–106. https://doi.org/10.1002/sim.1228.

24. Sengupta, S, Mukhuti, S. Unbiased estimation of P(X>Y) using ranked set sample data. Statistics 2008;42:223–30. https://doi.org/10.1080/02331880701823271.

25. Yin, J, Hao, Y, Samawi, H, Rochani, H. Rank-based kernel estimation of the area under the ROC curve. Stat Methodol 2016;32:91–106. https://doi.org/10.1016/j.stamet.2016.04.001.

26. Mahdizadeh, M, Zamanzade, E. Kernel based estimation of P(X>Y) in ranked set sampling. SORT 2016;40:243–66. https://www.raco.cat/index.php/SORT/article/view/316145.

27. Zou, KH, Hall, WJ, Shapiro, DE. Smooth non-parametric receiver-operating characteristic (ROC) curves for continuous diagnostic tests. Stat Med 1997;16:2143–56. https://doi.org/10.1002/(SICI)1097-0258(19971015)16:19<2143::AID-SIM655>3.0.CO;2-3.

28. MacEachern, SN, Stasny, EA, Wolfe, DA. Judgement post-stratification with imprecise rankings. Biometrics 2004;60:207–15. https://doi.org/10.1111/j.0006-341x.2004.00144.x.

29. Wang, X, Wang, K, Lim, J. Isotonized CDF estimation from judgment poststratification data with empty strata. Biometrics 2012;68:194–202. https://doi.org/10.1111/j.1541-0420.2011.01655.x.

30. Robertson, T, Wright, FT, Dykstra, RL. Order-restricted inferences. New York: Wiley; 1988.

31. Dell, TR, Clutter, JL. Ranked set sampling theory with order statistics background. Biometrics 1972;28:545–55. https://doi.org/10.2307/2556166.

32. Fligner, MA, MacEachern, SN. Nonparametric two-sample methods for ranked-set sample data. J Am Stat Assoc 2006;101:1107–18. https://doi.org/10.1198/016214506000000410.

33. Wang, X, Lim, J, Stokes, L. A nonparametric mean estimator for judgment post-stratified data. Biometrics 2008;64:355–63. https://doi.org/10.1111/j.1541-0420.2007.00900.x.


Supplementary Material

The online version of this article offers supplementary material (https://doi.org/10.1515/ijb-2019-0127).


Received: 2019-10-29
Accepted: 2020-06-08
Published Online: 2020-08-07

© 2020 Walter de Gruyter GmbH, Berlin/Boston
