Abstract
Structural inference for semiparametric measurement error models has not been well developed in the existing literature, partly because of the difficulty of handling unobservable covariates. In this study, we develop a framework for adaptive structure selection in partially linear error-in-function models with error-prone covariates. First, based on the profile least-squares estimators of the models, we define two test statistics via the generalized likelihood ratio (GLR) test method (Fan et al. in Ann Stat 29(1):153–193, 2001). The proposed test statistics are shown to possess Wilks-type properties, and a new class of Wilks phenomena is unveiled in the family of semiparametric measurement error models. We then demonstrate that the GLR statistics asymptotically follow chi-squared distributions under the null hypotheses. Further, we propose efficient algorithms to implement the methodology and assess its finite-sample performance through simulated examples. A real data example illustrates the performance of the methodology.
References
Apanasovich, T. V., Carroll, R. J., & Maity, A. (2009). SIMEX and standard error estimation in semiparametric measurement error models. Electronic Journal of Statistics, 3, 318–348.
Carroll, R. J., Ruppert, D., Stefanski, L. A., & Crainiceanu, C. M. (2006). Measurement error in nonlinear models: A modern perspective. Boca Raton: Chapman and Hall/CRC.
Carroll, R. J., Fan, J. Q., Gijbels, I., & Wand, M. P. (1997). Generalized partially linear single-index models. Journal of the American Statistical Association, 92(438), 477–489.
Chen, X., & Cui, H. J. (2012). Empirical likelihood inference for parameters in a partially linear errors-in-variables model. Statistics: A Journal of Theoretical and Applied Statistics, 46(6), 745–757.
Cui, H. J., & Kong, E. (2006). Empirical likelihood confidence region for parameters in semi-linear errors-in-variables models. Scandinavian Journal of Statistics, 33(1), 153–168.
Cui, H. J., & Li, R. C. (1998). On parameter estimation for semi-linear errors-in-variables models. Journal of Multivariate Analysis, 64, 1–24.
De Jong, P. (1987). A central limit theorem for generalized quadratic forms. Probability Theory and Related Fields, 75(2), 261–277.
Engle, R. F., Granger, C. W. J., Rice, J., & Weiss, A. (1986). Semiparametric estimates of the relation between weather and electricity sales. Journal of the American Statistical Association, 81(394), 310–320.
Fan, J. Q., & Gijbels, I. (1996). Local polynomial modelling and its applications. Boca Raton: Chapman and Hall/CRC.
Fan, J. Q., & Jiang, J. C. (2005). Nonparametric inferences for additive models. Journal of the American Statistical Association, 100(471), 890–907.
Fan, J. Q., & Jiang, J. C. (2007). Nonparametric inference with generalized likelihood ratio tests. Test, 16(3), 409–444.
Fan, J. Q., & Truong, Y. K. (1993). Nonparametric regression with errors in variables. The Annals of Statistics, 21(4), 1900–1925.
Fan, J. Q., Zhang, C. M., & Zhang, J. (2001). Generalized likelihood ratio statistics and Wilks phenomenon. The Annals of Statistics, 29(1), 153–193.
Hall, P., & Ma, Y. Y. (2007). Testing the suitability of polynomial models in errors-in-variables problems. The Annals of Statistics, 35(6), 2620–2638.
Härdle, W., Liang, H., & Gao, J. (2000). Partially linear models. Heidelberg: Physica.
Hinkley, D., & Schechtman, E. (1987). Conditional bootstrap methods in the mean-shift model. Biometrika, 74(1), 85–93.
Huang, Z. S. (2012). Empirical likelihood for the parametric part in partially linear errors-in-function models. Statistics and Probability Letters, 82(1), 63–66.
Huang, Z. S., & Ding, H. Y. (2017). Statistical estimation for partially linear error-in-variable models with error-prone covariates. Communications in Statistics-Simulation and Computation, 46(8), 6559–6573.
Huang, Z. S., Pang, Z., & Hu, T. (2013). Testing structural change in partially linear single-index models with error-prone linear covariates. Computational Statistics and Data Analysis, 59, 121–133.
Jiang, J. C., Zhou, H. B., Jiang, X. J., & Peng, J. A. (2007). Generalized likelihood ratio tests for the structure of semiparametric additive models. The Canadian Journal of Statistics, 35(3), 381–398.
Li, R. Z., & Liang, H. (2008). Variable selection in semiparametric regression modeling. The Annals of Statistics, 36(1), 261–286.
Li, X. L., You, J. H., & Zhou, Y. (2011). Statistical inference for varying-coefficient models with error-prone covariates. Journal of Statistical Computation and Simulation, 81(12), 1755–1771.
Liang, H. (2000). Asymptotic normality of parametric part in partially linear models with measurement error in the nonparametric part. Journal of Statistical Planning and Inference, 86(1), 51–62.
Liang, H. (2006). Estimation in partially linear models and numerical comparisons. Computational Statistics and Data Analysis, 50(3), 675–687.
Liang, H., Wang, S. J., & Carroll, R. J. (2007). Partially linear models with missing response variables and error-prone covariates. Biometrika, 94(1), 185–198.
Liu, J. X., & Ma, Y. Y. (2019). Locally efficient semiparametric estimators for a class of Poisson models with measurement error. The Canadian Journal of Statistics, 47, 157–181.
Liu, J. X., Ma, Y. Y., Zhu, L. P., & Carroll, R. J. (2017). Estimation and inference of error-prone covariate effect in the presence of confounding variables. Electronic Journal of Statistics, 11(1), 480–501.
Ma, Y. Y., & Carroll, R. J. (2006). Locally efficient estimators for semiparametric models with measurement error. Journal of the American Statistical Association, 101(476), 1465–1474.
Müller, S., & Vial, C. (2009). Partially linear model selection by the bootstrap. Australian and New Zealand Journal of Statistics, 51(2), 183–200.
Rao, J. N. K., & Scott, A. J. (1981). The analysis of categorical data from complex sample surveys: Chi-squared tests for goodness of fit and independence in two-way tables. Journal of the American Statistical Association, 76(374), 221–230.
Ruppert, D., Wand, M. P., & Carroll, R. J. (2003). Semiparametric regression. New York: Cambridge University Press.
Speckman, P. (1988). Kernel smoothing in partial linear models. Journal of the Royal Statistical Society. Series B (Methodological), 50(3), 413–436.
Wahba, G. (1984). Partial spline models for semiparametric estimation of functions of several variables. In Statistical Analysis of Time Series. Proceedings of the Japan U.S. Joint Seminar, Tokyo (pp. 319–329). Tokyo: Institute of Statistical Mathematics.
Wang, L. F., Li, H. Z., & Huang, J. H. (2008). Variable selection in nonparametric varying-coefficient models for analysis of repeated measurements. Journal of the American Statistical Association, 103(484), 1556–1569.
Xue, L. G., & Liu, Q. (2010). Bootstrap approximation of wavelet estimates in a semiparametric regression model. Acta Mathematica Sinica, English Series, 26(4), 763–778.
Yan, L., & Chen, X. (2014). Empirical likelihood for partly linear models with errors in all variables. Journal of Multivariate Analysis, 130, 275–288.
You, J. H., & Zhou, X. (2005). Bootstrap of a semiparametric partially linear model with autoregressive errors. Statistica Sinica, 15, 117–133.
Zhang, J., Feng, Z. H., Xu, P. R., & Liang, H. (2017). Generalized varying coefficient partially linear measurement errors models. Annals of the Institute of Statistical Mathematics, 69(1), 97–120.
Zhang, R. Q. (2007). Tests for nonparametric parts on partially linear single index models. Science in China Series A: Mathematics, 50(3), 439–449.
Zhou, Y., & Liang, H. (2009). Statistical inference for semiparametric varying-coefficient partially linear models with error-prone linear covariates. The Annals of Statistics, 37(1), 427–458.
Zhu, L. X., & Cui, H. J. (2003). A semi-parametric regression model with errors in variables. Scandinavian Journal of Statistics, 30(2), 429–442.
Acknowledgements
This research was supported by the National Natural Science Foundation of China (Grant Nos. 11471160, 11101114), the National Statistical Science Research Major Program of China (Grant No. 2018LD01), the Fundamental Research Funds for the Central Universities (Grant No. 30920130111015), the Jiangsu Provincial Basic Research Program (Natural Science Foundation) (Grant No. BK20131345) and the Qing Lan Project. The authors thank the Editor, an Associate Editor and two referees for their constructive comments, which led to significant improvements of the paper.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Before proving the main theorems of Sect. 2, we state the conditions used in the paper.
Condition (A)
- \((A_{1})\): The nonparametric function \(g(\cdot )\) has a Lipschitz continuous second derivative \(g''(\cdot )\).
- \((A_{2})\): The covariates \(\xi \) and T have a joint density \(p(\xi ,t)\). Further, the marginal density \(f_{T}(t)\) of T is compactly supported, bounded, Lipschitz continuous and bounded away from 0. T has a bounded support \(\mathscr {T}\) and \(\xi \) has a bounded support.
- \((A_{3})\): The kernel functions \(K(\cdot )\) and \(L(\cdot )\) are symmetric density functions with compact support \([-1,1]\), and they are bounded, nonnegative and Lipschitz continuous.
- \((A_{4})\): \(E(\xi |T)\), \(E(\xi \xi ^{T}|T)\) and \(E((\xi \xi ^{T})*(\xi \xi ^{T})|T)\) are Lipschitz continuous, where \(G*H\) denotes the Hadamard product of matrices G and H.
- \((A_{5})\): \(E(\varepsilon ^{4}_{i})<\infty \).
- \((A_{6})\): The marginal density \(f_{T}(t)\) of T has a bounded kth derivative, where k is a positive integer. The random variable X has a bounded support \(\mathscr {X}\). The density function \(f_{X}(x)\) of X is bounded away from 0 on \(\mathscr {X}\).
- \((A_{7})\): For each \(T \in \mathscr {T}\), the matrix \(E(\xi \xi ^{T}|T)\) is nonsingular and the matrix \(B=E(V_{1}V_{1}^{T})\) is positive definite.
- \((A_{8})\): Let \(h_{j}(t)=E(\hat{\xi }_{ij}|T_i=t)\), where \(\hat{\xi }_{ij}=\hat{\xi }_j(X_i)\), \(1\le i\le n\), \(1\le j\le p\). Both \(g(\cdot )\) and \(h_{j}(\cdot )\) are Lipschitz continuous of order 1.
- \((A_{9})\): There is a \(d>2\) such that \(E|\xi |^{2d}<\infty \), \(E|X|^{2d}<\infty \), and for \(\delta <2-d^{-1}\), \(n^{2\delta -1}h\rightarrow \infty \), \(n^{2\delta -1}b_{k}\rightarrow \infty \), \(nhb_{k}^{2(r+1)}\rightarrow 0\), where \(\{b_{k}\}^p_{k=1}\) is the bandwidth sequence.
- \((A_{10})\): \(nh^{8}\rightarrow 0\), \(nh^{2}/(\log n)^{2}\rightarrow \infty \).
- \((A_{11})\): The characteristic function \(\phi _{v}(\cdot )\) of v is not identically zero, and it is assumed to be ordinary smooth or super smooth.
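As a quick check on Condition \((A_{10})\), consider a power-law bandwidth \(h=cn^{-a}\) with constants \(c>0\) and \(a>0\); this parametrization is an illustrative assumption, not a choice prescribed by the paper. Then

\[
nh^{8}=c^{8}n^{1-8a}\rightarrow 0 \iff a>\tfrac{1}{8},\qquad \frac{nh^{2}}{(\log n)^{2}}=\frac{c^{2}n^{1-2a}}{(\log n)^{2}}\rightarrow \infty \iff a<\tfrac{1}{2},
\]

so any exponent \(1/8<a<1/2\) satisfies \((A_{10})\); in particular, the familiar rate \(h\propto n^{-1/5}\) is admissible.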
Lemma
Suppose Conditions \((A_1)\)–\((A_4)\) hold. Then, uniformly in \(t\in \mathscr {T}\), we have
under \(H_{0}\), \(t\in \mathscr {T}\), \(h\rightarrow 0,~nh^{3/2}\rightarrow \infty \), thus
where \(\nu _{2}=\int s^2K(s)ds\), \(e(t)=\frac{1}{n}\sum ^n_{k=1}\frac{1}{f_T(t)}K_{h}(T_k-t) \hat{\varepsilon }_k\).
This lemma, obtained from Carroll et al. (1997), gives the asymptotic behaviour of the estimator of \(g(\cdot )\) when \(\beta \) is estimated at the parametric rate. The rate \(\hat{\beta }_n-\beta =O_p(n^{-1/2})\) follows from Huang and Ding (2017).
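The two quantities named in the lemma, \(\nu _{2}\) and \(e(t)\), are straightforward to evaluate numerically. The following is a minimal sketch, assuming the Epanechnikov kernel (one kernel that satisfies \((A_{3})\)), the usual scaling \(K_{h}(u)=K(u/h)/h\), and a user-supplied density \(f_{T}\); the function names are hypothetical and do not come from the paper.

```python
import numpy as np

def epanechnikov(s):
    """Epanechnikov kernel: a symmetric density supported on [-1, 1]."""
    s = np.asarray(s, dtype=float)
    return np.where(np.abs(s) <= 1.0, 0.75 * (1.0 - s**2), 0.0)

def nu2(kernel, num=200001):
    """Approximate nu_2 = integral of s^2 K(s) ds over [-1, 1] by a Riemann sum."""
    s = np.linspace(-1.0, 1.0, num)
    ds = s[1] - s[0]
    return float(np.sum(s**2 * kernel(s)) * ds)

def e_term(t, T, resid, h, f_T, kernel=epanechnikov):
    """e(t) = (1/n) * sum_k K_h(T_k - t) * resid_k / f_T(t).

    Assumes K_h(u) = K(u / h) / h; f_T is a callable returning the
    (assumed known or separately estimated) marginal density of T.
    """
    weights = kernel((np.asarray(T, dtype=float) - t) / h) / h
    return float(np.mean(weights * np.asarray(resid, dtype=float)) / f_T(t))

print(nu2(epanechnikov))
```

In practice \(f_{T}\) would be replaced by a kernel density estimate and `resid` by the fitted residuals \(\hat{\varepsilon }_k\).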
Proof of Theorem 1
The proof of Theorem 1 mainly follows the arguments of Zhang (2007). Write the GLR statistic for testing problem (3) as
the numerator part of \(\lambda _{ng}\) can be inferred as
We first prove the numerator formula (A.2). Simple computations give the corresponding residual sums of squares,
Analogously, we can get
Let \(A=g(W_i)-\hat{g}_n(W_i)\), \(B=(\beta -\hat{\beta }_n)^T\hat{\xi }_i\) and \(C=\hat{\varepsilon }_i\). By (A.1), we have
The rate \(\hat{\beta }_n-\beta =O_p(n^{-1/2})\) noted above yields
Applying Kolmogorov’s strong law of large numbers, we obtain
Then,
Combining the expressions for \(RSS_{0g}\) and \(RSS_1\) establishes formula (A.2). According to Zhang (2007), we further deduce
where,
To establish formula (A.3), we combine the arguments of Zhang (2007) with (A.1), which gives
and,
where,
Therefore, the numerator part (A.3) of the GLR statistic \(\lambda _{ng}\) is established. By Zhang (2007), the variance of \(W_n\) can be written as
where \(\sigma ^2_n=\frac{2|\mathscr {T}|}{h}\int (K(s)-\frac{1}{2}K*K(s))^2ds\). It implies that
Combining (A.3) with (A.4), we obtain the denominator part of \(\lambda _{ng}\),
We combine the numerator part (A.3) and the denominator part (A.5) of the statistic to derive
By the central limit theorem for generalized quadratic forms of De Jong (1987), \(W_n\) is asymptotically normal,
then we can get
This completes the proof of Theorem 1. \(\square \)
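The kernel-dependent factor in \(\sigma ^2_n\) can be checked numerically. The sketch below assumes the Epanechnikov kernel and reads \(K*K\) as the convolution of \(K\) with itself (the usual meaning in the GLR literature, as opposed to the Hadamard product used for matrices in \((A_{4})\)); the helper names are hypothetical.

```python
import numpy as np

def epanechnikov(s):
    """Epanechnikov kernel: a symmetric density supported on [-1, 1]."""
    s = np.asarray(s, dtype=float)
    return np.where(np.abs(s) <= 1.0, 0.75 * (1.0 - s**2), 0.0)

def variance_constant(kernel=epanechnikov, ds=1e-3):
    """Approximate the integral of (K(s) - K*K(s)/2)^2 ds on a regular grid."""
    m = int(round(1.0 / ds))
    s = np.arange(-m, m + 1) * ds            # grid on [-1, 1]
    k = kernel(s)
    kk = np.convolve(k, k) * ds              # self-convolution, lives on [-2, 2]
    u = np.arange(-2 * m, 2 * m + 1) * ds    # matching grid on [-2, 2]
    diff = kernel(u) - 0.5 * kk
    return float(np.sum(diff**2) * ds)

def sigma_n_squared(support_length, h, kernel=epanechnikov):
    """sigma_n^2 = (2 |T| / h) * integral of (K - K*K/2)^2."""
    return 2.0 * support_length / h * variance_constant(kernel)

print(variance_constant())
print(sigma_n_squared(support_length=1.0, h=0.1))
```

Here `support_length` plays the role of \(|\mathscr {T}|\); the grid step `ds` trades accuracy for speed and is an arbitrary choice.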
Proof of Theorems 2 and 3
First, recall from (A.5) that \(n^{-1}RSS_{1}=\sigma ^{2}(1+o_{P}(1))\). With \(\hat{g}_{i0}=\sum \nolimits _{j=1}^{n}\omega _{nj}(W_{i})(Y_{j}-\hat{\beta }_{0}^{T}\hat{\xi }_{j})\) and \(\hat{g}_{in}=\sum \nolimits _{j=1}^{n}\omega _{nj}(W_{i})(Y_{j}-\hat{\beta }^{T}_n\hat{\xi }_{j})\), we then have
where
Then we can obtain
According to Lemma 5 of Huang and Ding (2017),
thus,
\(Q_{2}\) and \(Q_{3}\) are negligible in probability. Combining (A.7) with Slutsky’s theorem, we obtain
By Rao and Scott (1981), the distribution of \(\varrho _{n}\sum \nolimits _{i=1}^{l}\omega _{i}\chi _{i1}^{2}\) is approximately the \(\chi ^2\) distribution with l degrees of freedom. This completes the proof of Theorem 3. \(\square \)
Theorem 2 is a special case of Theorem 3. Its proof therefore follows by similar arguments, and the details are omitted.
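The quality of the Rao and Scott (1981) approximation can be explored by simulation. The sketch below is illustrative only: it assumes the scaling factor is \(\varrho _{n}=l/\sum _{i}\omega _{i}\) (the first-order correction that matches the mean of the \(\chi ^2_l\) distribution; the paper's own definition of \(\varrho _{n}\) is not reproduced here), uses hypothetical weights, and obtains the reference critical value from simulated \(\chi ^2_l\) draws.

```python
import numpy as np

def rao_scott_scale(weights):
    """Assumed scaling l / sum(weights), so the scaled mixture has mean l."""
    w = np.asarray(weights, dtype=float)
    return w.size / w.sum()

def simulate_scaled_mixture(weights, n_draws, rng):
    """Draw rho * sum_i w_i * Z_i^2 with Z_i independent standard normals."""
    w = np.asarray(weights, dtype=float)
    z2 = rng.standard_normal((n_draws, w.size)) ** 2
    return rao_scott_scale(w) * (z2 @ w)

rng = np.random.default_rng(0)
weights = [0.8, 1.0, 1.2]                    # hypothetical weights, l = 3
mixture = simulate_scaled_mixture(weights, 200_000, rng)
reference = rng.chisquare(df=len(weights), size=200_000)
crit = np.quantile(reference, 0.95)          # simulated 95% point of chi-squared(l)
rejection_rate = float(np.mean(mixture > crit))

print(f"mean of scaled mixture: {mixture.mean():.3f}")
print(f"rejection rate at nominal 5%: {rejection_rate:.3f}")
```

When the weights are close to one another, the rejection rate stays near the nominal level; strongly unequal weights would call for a closer look at the approximation.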
Cite this article
Ye, Z., Huang, Z. & Ding, H. Adaptive structure inferences on partially linear error-in-function models with error-prone covariates. J. Korean Stat. Soc. 49, 177–199 (2020). https://doi.org/10.1007/s42952-019-00012-0
DOI: https://doi.org/10.1007/s42952-019-00012-0