Abstract
In many applications, some covariates are missing for various reasons. Ignoring the missing data can leave regression quantile estimates biased or under-powered. Multiple imputation (MI) and an EM-based augmentation approach have been proposed to make full use of data with missing covariates in quantile regression. Both methods, however, are computationally expensive. We propose a fast imputation algorithm (FI) to handle missing covariates in quantile regression, which extends fractional imputation from likelihood-based regression. FI and modified imputation algorithms (FIIPW and MIIPW) are compared with the existing MI and inverse probability weighting (IPW) approaches in simulation studies, and applied to part of the National Collaborative Perinatal Project study.
Acknowledgements
The authors thank Prof. Jae Kwang Kim for helpful discussions. The authors gratefully acknowledge NIH awards R01 HG008980 and R03 HG007443, and NSF award DMS-120923.
Cite this article
Cheng, H., Wei, Y. A fast imputation algorithm in quantile regression. Comput Stat 33, 1589–1603 (2018). https://doi.org/10.1007/s00180-018-0813-z
DOI: https://doi.org/10.1007/s00180-018-0813-z