Abstract
Cross-sectionally sampled data with a binary disease outcome are commonly analyzed in observational studies to identify the relationship between covariates and the disease outcome. A cross-sectional population is defined as the population of individuals alive at the sampling or observation time. It is generally understood that a binary disease outcome from cross-sectional data contains less information than longitudinally collected time-to-event data, but it is less well understood whether cross-sectional data can be biased and how any such bias relates to the population risk of interest. Wang and Yang (2021) presented the complexity and bias in cross-sectional data with binary disease outcome, with detailed analytical explorations of the data structure. Because the distribution of the cross-sectional binary outcome is quite different from the population risk distribution, bias can arise when cross-sectional data analysis is used to draw inference about population risk. In this paper we argue that the commonly adopted age-specific risk probability is biased for the estimation of population risk, and we propose an outcome reassignment (OR) approach that reassigns a portion of the observed binary outcomes, 0 or 1, to the other disease category. A sign test and a semiparametric pseudo-likelihood method are developed for analyzing cross-sectional data using the OR approach. Simulations and an analysis based on Alzheimer’s disease data are presented to illustrate the proposed methods.
References
Alexander L, Lopes B, Ricchetti-Masterson K, Yeatts KB, ERIC N (2015) Cross-sectional studies. Eric Noteb 2(6):1–5
Banerjee M, Wellner JA (2005) Confidence intervals for current status data. Scand J Stat 32(3):405–424
Cox DR (1972) Regression models and life-tables. J Royal Stat Soc: Series B (Methodol) 34(2):187–202
Edwards JK, Cole SR, Chu H, Olshan AF, Richardson DB (2014) Accounting for outcome misclassification in estimates of the effect of occupational asbestos exposure on lung cancer death. Am J Epidemiol 179(5):641–647
Efron B, Tibshirani RJ (1994) An introduction to the bootstrap. CRC Press, Cambridge
Finkelstein DM (1986) A proportional hazards model for interval-censored failure time data. Biometrics 42:845–854
Gilbert R, Martin RM, Donovan J, Lane JA, Hamdy F, Neal DE, Metcalfe C (2016) Misclassification of outcome in case-control studies: methods for sensitivity analysis. Stat Methods Med Res 25(5):2377–2393
Groeneboom P, Wellner JA (1992) Information bounds and nonparametric maximum likelihood estimation, vol 19. Springer Science & Business Media, Heidelberg
Ho DE, Imai K, King G, Stuart EA (2011) MatchIt: nonparametric preprocessing for parametric causal inference. J Stat Softw 42(8):1–28
Jewell NP, van der Laan M (2003) Current status data: review, recent developments and open problems. Handb Stat 23:625–642
Lin D, Oakes D, Ying Z (1998) Additive hazards regression with current status data. Biometrika 85(2):289–298
Mandel M (2015) Analyzing multiple cross-sectional samples with application to hospitalization time after surgeries. Stat Med 34(26):3415–3423
Mandel M, Fluss R (2009) Nonparametric estimation of the probability of illness in the illness-death model under cross-sectional sampling. Biometrika 96(4):861–872
Martinez BAF, Leotti VB, Silva GdSe, Nunes LN, Machado G, Corbellini LG (2017) Odds ratio or prevalence ratio? An overview of reported statistical methods and appropriateness of interpretations in cross-sectional studies with dichotomous outcomes in veterinary medicine. Front Vet Sci 4:193
McNemar Q (1947) Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 12(2):153–157
Müller M (2001) Estimation and testing in generalized partial linear models: a comparative study. Stat Comput 11(4):299–309
Rossini A, Tsiatis A (1996) A semiparametric proportional odds regression model for the analysis of current status data. J Am Stat Assoc 91(434):713–721
Severini TA, Staniswalis JG (1994) Quasi-likelihood estimation in semiparametric models. J Am Stat Assoc 89(426):501–511
Wang MC (1991) Nonparametric estimation from cross-sectional survival data. J Am Stat Assoc 86(413):130–143
Wang MC, Yang Y (2021) Complexity and bias in cross-sectional data with binary disease outcome in observational studies. Stat Med 40(4):950–962
Wisniewski T, Castano EM, Golabek A, Vogel T, Frangione B (1994) Acceleration of Alzheimer’s fibril formation by apolipoprotein E in vitro. Am J Pathol 145(5):1030
Cite this article
Wang, MC., Zhu, Y. Bias correction via outcome reassignment for cross-sectional data with binary disease outcome. Lifetime Data Anal 28, 659–674 (2022). https://doi.org/10.1007/s10985-022-09559-3