Unified approach for regression models with nonmonotone missing at random data

  • Original Paper
AStA Advances in Statistical Analysis

Abstract

The unified approach (Chen and Chen in J R Stat Soc B 62(3):449–460, 2000) uses a working regression model to extract information from auxiliary variables in a two-stage study in order to compute an efficient estimator of the regression parameters. As far as we know, the method has been limited to data that are missing completely at random in a simple monotone missing data pattern. In this research, we extend the unified approach to estimate regression models with nonmonotone missing at random data. We describe an inverse probability weighting estimator that is conditional on estimators from a set of working regression models, which contain information from the incomplete data and auxiliary variables. The proposed method is flexible and can easily accommodate incomplete data and auxiliary variables. We investigate the finite-sample performance of the proposed estimators using simulation studies and further illustrate the estimation method on a case–control study investigating the risk factors of hip fractures.
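
For orientation, the generic inverse probability weighting idea that the abstract builds on can be written as a weighted estimating equation. This is a textbook sketch in the spirit of Horvitz and Thompson (1952) and Robins et al. (1994), not the estimator proposed in this paper; the symbols R_i, \pi_i, S_\beta, Y_i, and X_i are notation assumed here for illustration only.

```latex
% Assumed notation (illustrative, not taken from the paper):
%   R_i      = 1 if subject i is a complete case, 0 otherwise
%   \pi_i    = P(R_i = 1 \mid \text{full data of subject } i), the selection probability
%   S_\beta  = score function of the regression model of Y on X
\sum_{i=1}^{n} \frac{R_i}{\hat{\pi}_i}\, S_\beta(Y_i, X_i; \beta) = 0
```

If \pi_i is correctly specified and bounded away from zero, each weighted complete-case term has the same expectation as the corresponding full-data score, which is why solving this equation can yield a consistent estimator of \beta.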

References

  • Barengolts, E., Karanouh, D., Kolodny, L., Kukreja, S.: Risk factors for hip fractures in predominantly African-American veteran male population. J. Bone Miner. Res. 16, S170 (2001)

  • Breslow, N.E., Lumley, T., Ballantyne, C.M., Chambless, L.E., Kulich, M.: Improved Horvitz-Thompson estimation of model parameters from two-phase stratified samples: applications in epidemiology. Stat. Biosciences 1, 32–49 (2009)

  • Breunig, C., Haan, P.: Nonparametric regression with selectively missing covariates. arXiv:1810.00411v2 [econ.EM], 1–37 (2019)

  • Breunig, C., Mammen, E., Simoni, A.: Nonparametric estimation in case of endogenous selection. J. Econom. 202, 268–285 (2018)

  • Chatterjee, N., Chen, Y., Breslow, N.E.: A pseudo-score estimator for regression problems with two-phase sampling. J. Am. Stat. Assoc. 98, 158–168 (2003)

  • Chatterjee, N., Li, Y.: Inference in semiparametric regression models under partial questionnaire design and nonmonotone missing data. J. Am. Stat. Assoc. 105, 787–797 (2010)

  • Chen, H.Y.: Nonparametric and semiparametric models for missing covariates in parametric regression. J. Am. Stat. Assoc. 99, 1176–1189 (2004)

  • Chen, H.Y., Xie, H., Qian, Y.: Multiple imputation for missing values through conditional semiparametric odds ratio models. Biometrics 67, 799–809 (2011)

  • Chen, Y.H., Chen, H.: A unified approach to regression analysis under double-sampling designs. J. R. Stat. Soc. B 62(3), 449–460 (2000)

  • Fitzmaurice, G., Davidian, M., Verbeke, G., Molenberghs, G.: Longitudinal Data Analysis. Chapman and Hall/CRC, Boca Raton (2009)

  • Han, P.: Multiply robust estimation in regression analysis with missing data. J. Am. Stat. Assoc. 109, 1159–1173 (2014)

  • Horvitz, D.G., Thompson, D.J.: A generalization of sampling without replacement from a finite universe. J. Am. Stat. Assoc. 47, 663–685 (1952)

  • Ibrahim, J.G., Chen, M.H., Lipsitz, S.R., Herring, A.H.: Missing-data methods for generalized linear models: A comparative review. J. Am. Stat. Assoc. 100, 332–346 (2005)

  • van der Laan, M.J., Robins, J.M.: Unified Methods for Censored Longitudinal Data and Causality. Springer-Verlag, New York (2003)

  • Lawless, J.F., Kalbfleisch, J.D., Wild, C.J.: Semiparametric methods for response-selective and missing data problems in regression. J. R. Stat. Soc. B 61(2), 413–438 (1999)

  • Lipsitz, S.R., Ibrahim, J.G.: A conditional model for incomplete covariates in parametric regression models. Biometrika 83(4), 916–922 (1996)

  • Lipsitz, S.R., Ibrahim, J.G., Zhao, L.: A weighted estimating equation for missing covariate data with properties similar to maximum likelihood. J. Am. Stat. Assoc. 94, 1147–1160 (1999)

  • Little, R.J.A., Rubin, D.B.: Statistical Analysis with Missing Data. Wiley, New York (2002)

  • Robins, J.M., Rotnitzky, A., Zhao, L.P.: Estimation of regression coefficients when some regressors are not always observed. J. Am. Stat. Assoc. 89, 846–866 (1994)

  • Rubin, D.B.: Inference and missing data. Biometrika 63(3), 581–592 (1976)

  • Rubin, D.B.: Multiple Imputation for Nonresponse in Surveys. Wiley, New York (1987)

  • Rubin, D.B.: Multiple imputation after 18+ years. J. Am. Stat. Assoc. 91, 473–489 (1996)

  • Scheuren, F.: Multiple imputation: how it began and continues. Am. Stat. 59, 315–319 (2005)

  • Sun, B., Tchetgen, E.J.T.: On inverse probability weighting for nonmonotone missing at random data. J. Am. Stat. Assoc. 113, 369–379 (2018)

  • Tsiatis, A.: Semiparametric Theory and Missing Data. Springer, New York (2006)

  • Wacholder, S., Carroll, R.J., Pee, D., Gail, M.H.: The partial questionnaire design for case-control studies. Stat. Med. 13, 623–634 (1994)

  • Zhao, L.P., Lipsitz, S.: Designs and analysis of two-stage studies. Stat. Med. 11, 769–782 (1992)

  • Zhao, Y.: Statistical inference for missing data mechanisms. Stat. Med. (2020). https://doi.org/10.1002/sim.8727

  • Zhao, Y., Lawless, J.F., McLeish, D.L.: Likelihood methods for regression models with expensive variables missing by design. Biometrical J. 51, 123–136 (2009)


Acknowledgements

We thank Professor Donald L. McLeish, Professor Jerald F. Lawless, the associate editor, and the anonymous reviewers for their helpful comments and suggestions. We are grateful to Professor Hua Yun Chen for letting us use the hip fracture data.

Author information

Corresponding author

Correspondence to Yang Zhao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research was partially supported by a grant from the Natural Sciences and Engineering Research Council of Canada (YZ).

About this article

Cite this article

Zhao, Y., Liu, M. Unified approach for regression models with nonmonotone missing at random data. AStA Adv Stat Anal 105, 87–101 (2021). https://doi.org/10.1007/s10182-020-00389-y

  • DOI: https://doi.org/10.1007/s10182-020-00389-y
