Abstract
In recent years, a large body of literature on missing data has emerged. Most of it focuses on situations where only the response or only the covariates are missing.
In this paper, we consider the adequacy check for the linear regression model with the response and covariates missing simultaneously.
We apply model adjustment and inverse probability weighting methods to deal with the missingness of the response and the covariates, respectively. To avoid the curse of dimensionality, we propose an empirical process test with a linear indicator weighting function. The asymptotic properties of the proposed test under the null hypothesis and under local and global alternatives are rigorously investigated. A consistent wild bootstrap method is developed to approximate the critical value.
Finally, simulation studies and a real data analysis are performed to show that the proposed method performs well.
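The ingredients named in the abstract (inverse probability weighting, a linear indicator weighting function, and a wild bootstrap) can be illustrated with a minimal Python sketch. It is not the authors' statistic: it uses complete cases only, assumes the selection probabilities are known, omits the model adjustment step for the response, and replaces the integral over projection directions with an average over random unit vectors. All function names (`ipw_fit`, `cvm_stat`, `wild_bootstrap_test`, `simulate`) and the simulated design are hypothetical.

```python
import numpy as np


def ipw_fit(X, y, w):
    """Weighted least squares: solve sum_i w_i x_i (y_i - x_i'b) = 0."""
    Xw = X * w[:, None]
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)


def cvm_stat(X, marks, dirs):
    """Cramer-von Mises type statistic: average over directions a of
    mean_j R(a, a'x_j)^2, with R(a, t) = n^{-1/2} sum_i marks_i 1{a'x_i <= t}."""
    n = X.shape[0]
    total = 0.0
    for a in dirs:
        p = X @ a
        # ind[i, j] = 1{a'x_i <= a'x_j}: the linear indicator weight
        ind = (p[:, None] <= p[None, :]).astype(float)
        R = (marks @ ind) / np.sqrt(n)   # process evaluated at t = a'x_j
        total += np.mean(R ** 2)
    return total / len(dirs)


def wild_bootstrap_test(X, y, w, n_dirs=50, n_boot=200, seed=0):
    """Return (statistic, bootstrap p-value) for H0: E[y | x] is linear in x.
    X, y hold complete cases only; w holds their inverse selection probabilities."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Random unit directions (assumption: Monte Carlo stand-in for an integral)
    dirs = rng.standard_normal((n_dirs, p))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    beta = ipw_fit(X, y, w)
    resid = y - X @ beta
    stat = cvm_stat(X, w * resid, dirs)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        v = rng.choice([-1.0, 1.0], size=n)      # Rademacher multipliers
        y_star = X @ beta + v * resid            # data regenerated under H0
        beta_star = ipw_fit(X, y_star, w)
        boot[b] = cvm_stat(X, w * (y_star - X @ beta_star), dirs)
    return stat, float(np.mean(boot >= stat))


def simulate(n, seed, quadratic=False):
    """Hypothetical design: (x, y) are missing together, at random given z."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, 2))
    z = rng.standard_normal(n)                    # always-observed auxiliary variable
    pi = 1.0 / (1.0 + np.exp(-(1.0 + 0.5 * z)))   # selection probability, assumed known
    delta = rng.random(n) < pi                    # True when the case is complete
    y = 1.0 + x[:, 0] - x[:, 1] + rng.standard_normal(n)
    if quadratic:
        y = y + 3.0 * x[:, 0] ** 2                # departure from linearity
    X = np.column_stack([np.ones(n), x])          # intercept only shifts the threshold t
    return X[delta], y[delta], 1.0 / pi[delta]
```

Calling `wild_bootstrap_test(*simulate(200, seed=1))` returns the statistic and a bootstrap p-value for a correctly specified linear model; passing `quadratic=True` to `simulate` generates data under an alternative. Because the indicator depends on the covariates only through the scalar projection a'x, each term of the statistic is a one-dimensional cumulative sum, which is how a projection of this kind sidesteps the curse of dimensionality.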
Additional information
This research was supported by Key projects of philosophy and social science in Beijing (15ZDA47), the National Natural Science Foundation of China (Grant Nos. 11571340, 11971045), the Beijing Natural Science Foundation (1202001) and the Open Project of Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences.
Cite this article
Zheng, Sj., Gao, Sy. & Sun, Zh. Projection-based Consistent Test for Linear Regression Model with Missing Response and Covariates. Acta Math. Appl. Sin. Engl. Ser. 36, 917–935 (2020). https://doi.org/10.1007/s10255-020-0976-6
DOI: https://doi.org/10.1007/s10255-020-0976-6
Keywords
- consistency
- linear indicator weighting function
- empirical process
- missing response and covariates
- projection