A fast imputation algorithm in quantile regression

  • Original Paper
Computational Statistics

Abstract

In many applications, some covariates may be missing for various reasons. Regression quantiles can be either biased or under-powered when the missing data are ignored. Multiple imputation (MI) and an EM-based augmentation approach have been proposed to fully utilize the data with missing covariates in quantile regression. Both methods, however, are computationally expensive. We propose a fast imputation algorithm (FI) to handle missing covariates in quantile regression, which extends fractional imputation from likelihood-based regression. FI and modified imputation algorithms (FIIPW and MIIPW) are compared with the existing MI and inverse probability weighting (IPW) approaches in simulation studies and applied to part of the National Collaborative Perinatal Project study.

Acknowledgements

The authors are thankful for helpful discussions with Prof. Jae Kwang Kim. The authors gratefully acknowledge NIH awards R01 HG008980 and R03 HG007443, and NSF award DMS-120923.

Author information

Corresponding author

Correspondence to Hao Cheng.

Cite this article

Cheng, H., Wei, Y. A fast imputation algorithm in quantile regression. Comput Stat 33, 1589–1603 (2018). https://doi.org/10.1007/s00180-018-0813-z

  • DOI: https://doi.org/10.1007/s00180-018-0813-z
