Abstract
This paper develops a varying-coefficient approach to the estimation and testing of regression quantiles under randomly truncated data. To handle the truncation, random weights are introduced and weighted quantile regression (WQR) estimators of the nonparametric functions are proposed. To improve efficiency, we further develop a weighted composite quantile regression (WCQR) estimation method for the nonparametric functions in varying-coefficient models. The asymptotic properties of both the WQR and WCQR estimators are established. In addition, we propose a novel bootstrap-based test of whether the nonparametric functions in varying-coefficient quantile models can be specified by some given functional forms. The performance of the proposed estimators and test procedure is investigated through simulation studies and a real data example.
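As a rough illustration of the weighting idea, consider the simplest possible case: a single constant quantile, with no covariates and no kernel smoothing. This is not the estimator of the paper, only a minimal sketch of how a weighted check loss is minimized. The function names and the data are hypothetical, and the list `ws` merely stands in for random weights such as the factor \(1/G_n(Y_i)\) that appears in the appendix.

```python
def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def weighted_loss(a, ys, ws, tau):
    """Weighted check loss: sum_i w_i * rho_tau(y_i - a)."""
    return sum(w * check_loss(y - a, tau) for y, w in zip(ys, ws))


def weighted_quantile(ys, ws, tau):
    """Return a minimizer of weighted_loss over a, for positive weights.

    The loss is convex and piecewise linear in a, so a minimizer is the
    smallest observation whose cumulative weight reaches tau * total weight.
    """
    pairs = sorted(zip(ys, ws))
    total = sum(ws)
    cum = 0.0
    for y, w in pairs:
        cum += w
        if cum >= tau * total:
            return y
    return pairs[-1][0]


# Hypothetical observations and weights (assumed, not taken from the paper).
ys = [2.0, 0.5, 3.1, 1.2, 4.4]
ws = [1.0, 2.5, 1.2, 1.0, 3.0]
median_hat = weighted_quantile(ys, ws, 0.5)
```

With equal weights this reduces to an ordinary sample quantile; unequal weights shift the estimate toward observations that the truncation mechanism under-represents.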
References
Andriyana, Y., Gijbels, I., Verhasselt, A.: P-spline quantile regression estimation in varying coefficient models. Test 23, 153–194 (2014)
Andriyana, Y., Gijbels, I.: Quantile regression in heteroscedastic varying coefficient models. AStA Adv. Stat. Anal. 101, 151–176 (2017)
Cai, Z.W., Fan, J.Q., Yao, Q.W.: Functional-coefficient regression models for nonlinear time series. J. Am. Stat. Assoc. 95, 941–956 (2000)
Fan, J.Q., Huang, T.: Profile likelihood inferences on semiparametric varying-coefficient partially linear models. Bernoulli 11, 1031–1057 (2005)
Guo, J., Tian, M.Z., Zhu, K.: New efficient and robust estimation in varying-coefficient models with heteroscedasticity. Stat. Sin. 22, 1075–1101 (2012)
Hastie, T., Tibshirani, R.: Varying-coefficient models. J. R. Stat. Soc. B 55, 757–796 (1993)
He, S.Y., Yang, G.L.: Estimation of the truncation probability in the random truncation model. Ann. Stat. 26, 1011–1027 (1998)
He, S.Y., Yang, G.L.: Estimation of regression parameters with left truncated data. J. Stat. Plan. Inference 117, 99–122 (2003)
Honda, T.: Quantile regression in varying coefficient models. J. Stat. Plan. Inference 121, 113–125 (2004)
Jiang, R., Zhou, Z.G., Qian, W.M., Chen, Y.: Two step composite quantile regression for single-index models. Comput. Stat. Data Anal. 64, 180–191 (2013)
Jiang, R., Qian, W.M., Zhou, Z.G.: Weighted composite quantile regression for single-index models. J. Multivar. Anal. 148, 34–48 (2016)
Kai, B., Li, R.Z., Zou, H.: Local composite quantile regression smoothing: an efficient and safe alternative to local polynomial regression. J. R. Stat. Soc. B 72, 49–69 (2010)
Kai, B., Li, R.Z., Zou, H.: New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Ann. Stat. 39, 305–332 (2011)
Kim, M.O.: Quantile regression with varying coefficients. Ann. Stat. 35, 92–108 (2007)
Knight, K.: Limiting distributions for \(l_1\) regression estimators under general conditions. Ann. Stat. 26, 755–770 (1998)
Koenker, R.: Quantile Regression. Econometric Society Monographs. Cambridge University Press, Cambridge (2005)
Koenker, R., Bassett, G.: Regression quantiles. Econometrica 46, 33–50 (1978)
Lemdani, M., Ould-Saïd, E., Poulin, P.: Asymptotic properties of a conditional quantile estimator with randomly truncated data. J. Multivar. Anal. 100, 546–559 (2009)
Liang, H.Y., Baek, J.I.: Asymptotic normality of conditional density estimation with left-truncated and dependent data. Stat. Papers 57, 1–20 (2016)
Liang, H.Y., Liu, A.A.: Kernel estimation of conditional density with truncated, censored and dependent data. J. Multivar. Anal. 120, 40–58 (2013)
Luo, S., Mei, C., Zhang, C.Y.: Smoothed empirical likelihood for quantile regression models with response data missing at random. AStA Adv. Stat. Anal. 101, 95–116 (2017)
Lv, Y.H., Zhang, R.Q., Zhao, W.H., Liu, J.C.: Quantile regression and variable selection of partial linear single-index model. Ann. Inst. Stat. Math. 67, 375–409 (2015)
Lynden-Bell, D.: A method of allowing for known observational selection in small samples applied to 3CR quasars. Mon. Not. R. Astron. Soc. 155, 95–118 (1971)
Mack, Y.P., Silverman, B.W.: Weak and strong uniform consistency of kernel regression estimators. Probab. Theory Relat. Fields 61, 405–415 (1982)
Ould-Saïd, E., Lemdani, M.: Asymptotic properties of a nonparametric regression function estimator with randomly truncated data. Ann. Inst. Stat. Math. 58, 357–378 (2006)
Stute, W., Wang, J.L.: The central limit theorem under random truncation. Bernoulli 14, 604–622 (2008)
Woodroofe, M.: Estimating a distribution function with truncated data. Ann. Stat. 13, 163–177 (1985)
Xu, H.X., Chen, Z.L., Wang, J.F., Fan, G.L.: Quantile regression and variable selection for partially linear model with randomly truncated data. Stat. Papers (2017). https://doi.org/10.1007/s00362-016-0867-3
Yu, K., Jones, M.C.: Local linear quantile regression. J. Am. Stat. Assoc. 93, 228–237 (1998)
Zhou, W.H.: A weighted quantile regression for randomly truncated data. Comput. Stat. Data Anal. 55, 554–566 (2011)
Zou, H., Yuan, M.: Composite quantile regression and the oracle model selection theory. Ann. Stat. 36, 1108–1126 (2008)
Acknowledgements
The authors thank the editor, an associate editor and the reviewers for their constructive comments, which have led to a dramatic improvement over the earlier version of this article. This research was supported by the National Natural Science Foundation of China (11371321, 11401006), Chinese Postdoctoral Science Foundation (2017M611083), the Project of Humanities and Social Science Foundation of Ministry of Education (15YJC910006), the National Statistical Science Research Program of China (2017LY51, 2016LY80, 2016LZ05), Zhejiang Provincial Natural Science Foundation (LY18A010007), Zhejiang Provincial Key Research Base for Humanities and Social Science Research (Statistics 1020XJ3316004G) and First Class Discipline of Zhejiang - A (Zhejiang Gongshang University - Statistics).
Appendix
Lemma A.1
Let \((X_1,Y_1), \ldots , (X_n,Y_n)\) be independent and identically distributed (i.i.d.) random vectors. Assume that \(E|Y|^r<\infty \) and \(\sup _x\int |y|^rf(x,y){\hbox {d}}y<\infty \), where f denotes the joint density of (X, Y). Let K be a bounded positive function with a bounded support, satisfying a Lipschitz condition. Then
provided that \(n^{2\varepsilon -1}h\rightarrow \infty \) for some \(\varepsilon <1-r^{-1}\).
Lemma A.1 follows from the result by Mack and Silverman (1982).
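For orientation, uniform-consistency results of the Mack and Silverman type are usually stated in the following form. This is a sketch of the usual statement under assumed notation: the rescaled kernel \(K_h(\cdot )=K(\cdot /h)/h\) and the compact set \(D\) are not defined in the text above and are introduced here only for illustration.

```latex
\sup_{x\in D}\left|\frac{1}{n}\sum_{i=1}^{n}
\bigl\{K_h(X_i-x)Y_i-E[K_h(X_i-x)Y_i]\bigr\}\right|
=O_p\!\left(\left\{\frac{\log(1/h)}{nh}\right\}^{1/2}\right)
```

The rate on the right-hand side is the quantity \(\delta _n\) defined below, which is the form in which the lemma is invoked in the proof of Theorem 2.2.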
Lemma A.2
(Lv et al. 2015) Suppose \(A_n(s)\) is convex and can be represented as \(\frac{1}{2}s^\mathrm{T}Vs+U_n^\mathrm{T}s+C_n+r_n(s)\), where V is symmetric and positive definite, \(U_n\) is stochastically bounded, \(C_n\) is arbitrary and \(r_n(s)\) goes to zero in probability for each s. Then \(\alpha _n\), the argmin of \(A_n\), is only \(o_p(1)\) away from \(\beta _n=-V^{-1}U_n\), the argmin of \(\frac{1}{2}s^\mathrm{T}Vs+U_n^\mathrm{T}s+C_n\). If also \(U_n {\mathop {\rightarrow }\limits ^{\mathcal {D}}}U\), then \(\alpha _n{\mathop {\rightarrow }\limits ^{\mathcal {D}}}-V^{-1}U\).
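The closed form \(\beta _n=-V^{-1}U_n\) in Lemma A.2 can be seen by completing the square. The short derivation below is a sketch that uses only the symmetry and positive definiteness of V assumed in the lemma.

```latex
\frac{1}{2}s^{\mathrm T}Vs+U_n^{\mathrm T}s+C_n
=\frac{1}{2}\bigl(s+V^{-1}U_n\bigr)^{\mathrm T}V\bigl(s+V^{-1}U_n\bigr)
-\frac{1}{2}U_n^{\mathrm T}V^{-1}U_n+C_n
```

Since V is positive definite, the first term on the right is nonnegative and vanishes only at \(s=-V^{-1}U_n\), while the remaining terms do not depend on s.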
Let \(\eta ^{*}_{i,k}=I(\varepsilon _i-c_{\tau _{k}}+r_i(u)\le 0)-\tau _k, ~\eta _{i,k}=I(\varepsilon _i-c_{\tau _{k}}\le 0)-\tau _k, ~\delta _n=\Big (\frac{\log (1/h)}{nh}\Big )^{1/2}\).
Proof of Theorem 2.1
The proof of Theorem 2.1 follows a strategy similar to that of Theorem 2.2, so we omit the details here.
Proof of Theorem 2.2
Recall that \(\{\hat{a}_{0,1}, \ldots , \hat{a}_{0,q}, \hat{\mathbf{a}}, \hat{b}_0, \hat{\mathbf{b}}\}\) minimizes
Denote
\(\mathbf{e_k}\) is a q-vector with 1 at the kth position and 0 elsewhere. We write \(Y_i-a_{0,k}-b_0(U_i-u)-X_i^\mathrm{T}\{\mathbf{a}+\mathbf{b}(U_i-u)\}=\varepsilon _i-c_{\tau _{k}}+r_i(u)-N^\mathrm{T}_{i,k}\xi /\sqrt{nh}:=\varepsilon _i-c_{\tau _{k}}+r_i(u)-\Delta _{i,k}\), where \( r_i(u)=\alpha _0(U_i)-\alpha _0(u)-\alpha '_0(u)(U_i-u)+X_i^\mathrm{T}\{\alpha (U_i)-\alpha (u)-\alpha '(u)(U_i-u)\}, K_i(u)=K(\frac{U_i-u}{h})\), then \(\hat{\xi }\) will be the minimizer of
Following the identity by Knight (1998),
\[\rho _\tau (u-v)-\rho _\tau (u)=-v\psi _\tau (u)+\int _0^v\{I(u\le z)-I(u\le 0)\}{\hbox {d}}z,\]
where \(\psi _\tau (u)=\tau -I(u\le 0)\). Then we obtain
where \(W_{n,k}(u)=\frac{1}{\sqrt{nh}}\sum _{k=1}^q\sum _{i=1}^n\frac{K_i(u)}{G_n(Y_i)}\eta ^{*}_{i,k}N_{i,k}, \eta ^{*}_{i,k}=I(\varepsilon _i-c_{\tau _{k}}+r_i(u)\le 0)-\tau _k\).
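Knight's identity, in its usual form \(\rho _\tau (u-v)-\rho _\tau (u)=-v\psi _\tau (u)+\int _0^v\{I(u\le z)-I(u\le 0)\}{\hbox {d}}z\) with \(\rho _\tau (u)=u\{\tau -I(u<0)\}\), can be checked numerically. The following is a minimal sketch for illustration only: the function names are hypothetical, and the integral is approximated by a midpoint rule rather than evaluated exactly.

```python
def rho(u, tau):
    """Check loss rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def psi(u, tau):
    """psi_tau(u) = tau - I(u <= 0), as defined in the text."""
    return tau - (1.0 if u <= 0 else 0.0)


def knight_integral(u, v, steps=20000):
    """Midpoint-rule approximation of int_0^v {I(u <= z) - I(u <= 0)} dz.

    The step dz carries the sign of v, so the result is a signed integral.
    """
    dz = v / steps
    base = 1.0 if u <= 0 else 0.0
    total = 0.0
    for j in range(steps):
        z = (j + 0.5) * dz
        total += ((1.0 if u <= z else 0.0) - base) * dz
    return total


def knight_gap(u, v, tau):
    """Left-hand side minus right-hand side of Knight's identity."""
    lhs = rho(u - v, tau) - rho(u, tau)
    rhs = -v * psi(u, tau) + knight_integral(u, v)
    return lhs - rhs


# The gap should be zero up to the discretization error of the midpoint rule.
gap_example = knight_gap(0.4, 2.0, 0.5)
```

In the proof, the identity is applied with \(u=\varepsilon _i-c_{\tau _k}+r_i(u)\) and \(v=\Delta _{i,k}\), which splits the objective into a term linear in \(\xi \) and an integral remainder.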
Firstly, we prove that \(E\{\sum _{k=1}^qB_{n,k}(\xi )\}=\frac{1}{2}\xi ^\mathrm{T}\frac{f_U(u)}{\theta }S(u)\xi \). Let \(\widetilde{B}_{n,k}(\xi )=\sum _{i=1}^n\frac{K_i(u)}{G(Y_i)}\int _{0}^{\Delta _{i,k}}\big \{I(\varepsilon _i\le c_{\tau _{k}}-r_i(u)+z)-I(\varepsilon _i\le c_{\tau _{k}}-r_i(u))\big \}{\hbox {d}}z\), \(\Delta (u,x,\mu )\) and \( r_i(u,x,\mu )\) be equal to \(N_{i,k}^\mathrm{T}\xi /\sqrt{nh}\) and \( r_i(u)\), where \(X_i, U_i\) are replaced by \(x,\mu \).
Since \(\widetilde{B}_{n,k}(\xi )\) is a summation of i.i.d. random variables of the kernel form, according to Lemma A.1, we have \(\widetilde{B}_{n,k}(\xi )=E[\widetilde{B}_{n,k}(\xi )]+O_p(\delta _n)\). The expectation of \(\widetilde{B}_{n,k}(\xi )\) is
Further, we can prove that \(\mathbb {E}\{S_n(u)\}=f_U(u)S(u)+O(h^2)\), where \(S(u)=\text{ diag }\{S_1(u),c\mu _2S_2(u)\}\), \(S_2(u)=\mathbb {E}\{(1,X^\mathrm{T})^\mathrm{T}(1,X^\mathrm{T})|U=u\}\), \(c_{\tau _{k}}=F_{\varepsilon }^{-1}(\tau _k)\),
C is a \(q\times q\) diagonal matrix with \(C_{jj}=f_{\varepsilon }(c_{\tau _{j}})\), \(\mathbf{c}=(f_{\varepsilon }(c_{\tau _{1}}), \ldots , f_{\varepsilon }(c_{\tau _{q}}))^\mathrm{T}\), \(c=\sum ^q_{k=1}f_{\varepsilon }(c_{\tau _{k}})\). Similarly, we can obtain \(\text{ Var }\{{\widetilde{B}_{n,k}(\xi )}\}=o(1)\). Then \(\widetilde{B}_{n,k}(\xi )=\frac{1}{2}\xi ^\mathrm{T}\frac{f_U(u)}{\theta }S(u)\xi +O_p(\delta _n)\). According to Lemma 5.2 in Liang and Baek (2016), we have
By some calculations, we can obtain
Thus,
According to Lemma A.2, the minimizer of \(Q_{n}(\xi )\) can be expressed as
Therefore,
where \(W^*_{n,k}(u)=\frac{1}{\sqrt{nh}}\sum _{k=1}^q\sum _{i=1}^n\frac{K_i(u)}{G_n(Y_i)}\eta ^{*}_{i,k}(e_k^\mathrm{T},X_i^\mathrm{T})^\mathrm{T}\).
Denote \(\widetilde{W}^*_{n,k}(u)=\frac{1}{\sqrt{nh}}\sum _{k=1}^q\sum _{i=1}^n\frac{K_i(u)}{G(Y_i)}\eta _{i,k}(e_k^\mathrm{T},X_i^\mathrm{T})^\mathrm{T}:=(w_{11}, \ldots , w_{1q},w_{21})^\mathrm{T}\), where \(w_{1k}=(nh)^{-1/2}\sum _{i=1}^n\frac{K_i(u)}{G(Y_i)}\eta _{i,k}, k=1,\ldots ,q\), and \(w_{21}=(nh)^{-1/2}\sum _{k=1}^q\sum _{i=1}^n\frac{K_i(u)}{G(Y_i)}\eta _{i,k}X_i\). Note that \(\text{ Cov }(\eta _{i,k}, \eta _{i,k^{'}})=\tau _{kk^{'}}=\tau _k\wedge \tau _{k^{'}}-\tau _k\tau _{k^{'}}\) and \(\text{ Cov }(\eta _{i,k}, \eta _{j,k^{'}})=0\) if \(i\ne j\). Then
Similarly, we can obtain \(E(w_{21})=0\). On the other hand,
Similarly, we can obtain that \(\text{ Cov }(w_{1k}, w_{21})=\frac{\nu _0 f_U(u)}{\theta }\sum _{k^{'}=1}^q\lambda ^1_{kk^{'}}(u):=\frac{\nu _0 f_U(u)}{\theta }A_{12}(u)\) and \(\text{ Var }(w_{21})=\frac{\nu _0 f_U(u)}{\theta }\sum _{k=1}^q\sum _{k^{'}=1}^q\lambda ^2_{kk^{'}}(u):=\frac{\nu _0 f_U(u)}{\theta }A_{22}(u)\).
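The covariance \(\tau _{kk^{'}}\) used above can be checked directly. The sketch below assumes that \(\varepsilon _i\) has a continuous distribution function \(F_{\varepsilon }\), so that \(P(\varepsilon _i\le c_{\tau _k})=\tau _k\) and \(E(\eta _{i,k})=0\); it treats \(\varepsilon _i\) as a draw from \(F_{\varepsilon }\) and does not account for the truncation weights.

```latex
\begin{aligned}
\mathrm{Cov}(\eta_{i,k},\eta_{i,k'})
&=E\{I(\varepsilon_i\le c_{\tau_k})\,I(\varepsilon_i\le c_{\tau_{k'}})\}-\tau_k\tau_{k'}\\
&=P\{\varepsilon_i\le \min(c_{\tau_k},c_{\tau_{k'}})\}-\tau_k\tau_{k'}\\
&=\tau_k\wedge\tau_{k'}-\tau_k\tau_{k'}
\end{aligned}
```

For \(k=k^{'}\) this reduces to \(\tau _k(1-\tau _k)\), the variance of a Bernoulli indicator with success probability \(\tau _k\).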
By the Cramér–Wald theorem and the central limit theorem, we have
Define \(\overline{W}^*_{n,k}(u)=\frac{1}{\sqrt{nh}}\sum _{k=1}^q\sum _{i=1}^n\frac{K_i(u)}{G(Y_i)}\eta ^*_{i,k}(e_k^\mathrm{T},X_i^\mathrm{T})^\mathrm{T}:=(\bar{w}_{11}, \ldots , \bar{w}_{1q},\bar{w}_{21})^\mathrm{T},\) where \(\bar{w}_{1k}=\frac{1}{\sqrt{nh}}\sum _{i=1}^n\frac{K_i(u)}{G(Y_i)}\eta ^*_{i,k}, k=1,\ldots ,q,\) and \(\bar{w}_{21}=\frac{1}{\sqrt{nh}}\sum _{k=1}^q\sum _{i=1}^n\frac{K_i(u)}{G(Y_i)}\eta ^*_{i,k}X_i\).
By some calculations, we have
Thus \(\text{ Var }\{\overline{W}^*_{n,k}(u)-\widetilde{W}^*_{n,k}(u)\}=o(1)\). By Slutsky’s theorem, we have
Note that
Similar to the proof of (A.2), we have \(W^*_{n,k}(u)-\overline{W}^*_{n,k}(u)=o_p(1)\). Thus,
Next we calculate the mean of \(\overline{W}^*_{n,k}(u)\). In fact
Combining (A.4)–(A.8), the proof of Theorem 2.2 is completed.
Cite this article
Xu, HX., Fan, GL., Chen, ZL. et al. Weighted quantile regression and testing for varying-coefficient models with randomly truncated data. AStA Adv Stat Anal 102, 565–588 (2018). https://doi.org/10.1007/s10182-018-0319-6
DOI: https://doi.org/10.1007/s10182-018-0319-6
Keywords
- Varying-coefficient models
- Composite quantile regression
- Randomly truncated data
- Asymptotic normality
- Bootstrap