A fast imputation algorithm in quantile regression

Cheng, Hao; Wei, Ying

doi:10.1007/s00180-018-0813-z

Original Paper
Published: 15 May 2018

Volume 33, pages 1589–1603, (2018)

Computational Statistics


Abstract

In many applications, some covariates are missing for various reasons. Regression quantiles can be biased or under-powered when the missing data are ignored. Multiple imputation (MI) and an EM-based augmentation approach have been proposed to fully utilize data with missing covariates in quantile regression. Both methods, however, are computationally expensive. We propose a fast imputation algorithm (FI) to handle missing covariates in quantile regression, which extends fractional imputation from likelihood-based regression. FI and two modified imputation algorithms (FIIPW and MIIPW) are compared with the existing MI and inverse probability weighting (IPW) approaches in simulation studies, and applied to part of the National Collaborative Perinatal Project study.
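
To make the idea of fractional weights in quantile regression concrete, here is a minimal sketch. It is not the authors' FI algorithm. It assumes a single binary covariate, a saturated model `y = b0 + b1 * x`, and fractional weights that are simply supplied by the caller (in practice they would come from some imputation model, which is not shown). All names, including `fit_saturated_qr` and the toy `rows`, are hypothetical. An observation with a missing covariate is expanded into one row per candidate value, with weights summing to 1, and the weighted check loss is then minimized.

```python
from collections import defaultdict


def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def weighted_quantile(values, weights, tau):
    """Return a minimizer of sum_i w_i * rho_tau(y_i - q) over q."""
    pairs = sorted(zip(values, weights))
    target = tau * sum(weights)
    cumulative = 0.0
    for y, w in pairs:
        cumulative += w
        if cumulative >= target:
            return y
    return pairs[-1][0]


def fit_saturated_qr(rows, tau):
    """Fit y = b0 + b1 * x for binary x by weighted check loss.

    rows holds (y, x, w) triples. With a saturated binary model the
    objective separates by group, so each group quantile can be
    computed on its own.
    """
    groups = defaultdict(list)
    for y, x, w in rows:
        groups[x].append((y, w))
    q = {}
    for x in (0, 1):
        ys, ws = zip(*groups[x])
        q[x] = weighted_quantile(ys, ws, tau)
    return q[0], q[1] - q[0]


def objective(rows, b0, b1, tau):
    """Fractionally weighted check-loss objective."""
    return sum(w * check_loss(y - b0 - b1 * x, tau) for y, x, w in rows)


# Toy data (made up for illustration). Complete cases carry weight 1.
rows = [
    (1.0, 0, 1.0), (2.0, 0, 1.0), (3.0, 0, 1.0),
    (10.0, 1, 1.0), (11.0, 1, 1.0), (12.0, 1, 1.0),
    # One case with x missing, split into both candidate values.
    # The weights 0.8 and 0.2 are assumed, not estimated.
    (2.5, 0, 0.8), (2.5, 1, 0.2),
]

b0, b1 = fit_saturated_qr(rows, tau=0.5)
print(b0, b1)
```

For these toy rows the fitted median intercept is 2.0 and the slope is 9.0. With continuous covariates the objective no longer separates by group, and a general weighted quantile regression solver would be needed instead.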


Acknowledgements

The authors are thankful for helpful discussions with Prof. Jae Kwang Kim. The authors gratefully acknowledge NIH awards R01 HG008980 and R03 HG007443, and NSF award DMS-120923.

Author information

Authors and Affiliations

National Academy of Innovation Strategy, China Association for Science and Technology, Beijing, China
Hao Cheng
School of Statistics, Renmin University of China, Beijing, China
Hao Cheng
Department of Biostatistics, Columbia University, New York, USA
Hao Cheng & Ying Wei


Corresponding author

Correspondence to Hao Cheng.


About this article

Cite this article

Cheng, H., Wei, Y. A fast imputation algorithm in quantile regression. Comput Stat 33, 1589–1603 (2018). https://doi.org/10.1007/s00180-018-0813-z


Received: 21 January 2017
Accepted: 10 May 2018
Published: 15 May 2018
Issue Date: December 2018
DOI: https://doi.org/10.1007/s00180-018-0813-z
