Abstract
Quantile regression (QR) has become increasingly popular due to its relevance in many scientific investigations. There is a substantial body of work on linear and nonlinear QR models. In particular, nonparametric estimation of conditional quantiles has received special attention because of its model flexibility. However, nonparametric QR techniques are limited in the number of covariates they can accommodate. Dimension reduction offers a solution to this problem by allowing low-dimensional smoothing without specifying any parametric or nonparametric regression relation. Existing dimension reduction techniques focus on the entire conditional distribution. We, on the other hand, turn our attention to dimension reduction for conditional quantiles and introduce a new method for reducing the dimension of the predictor \(\mathbf {X}\). The novelty of this paper is threefold: we first consider a single-index quantile regression model, which assumes that the conditional quantile depends on \(\mathbf {X}\) through a single linear combination of the predictors; we then extend to a multi-index quantile regression model; and finally, we generalize the proposed methodology to any statistical functional of the conditional distribution. The performance of the methodology is demonstrated through simulation examples and real data applications. Our results suggest that the method has good finite sample performance and often outperforms existing methods.
References
Alkenani, A., Yu, K.: Penalized single-index quantile regression. Int. J. Stat. Probab. 2(3), 12–30 (2013)
Breiman, L., Friedman, J.H.: Estimating optimal transformations for multiple regression and correlation. J. Am. Stat. Assoc. 80(391), 580–598 (1985)
Brillinger, D.R.: A generalized linear model with ‘Gaussian’ regressor variables. In: Bickel, P.J., Doksum, K.A., Hodges, J.L. (eds.) A Festschrift for Erich L. Lehmann. Wadsworth, Belmont, CA (1983)
Bura, E., Cook, R.D.: Extending sliced inverse regression: the weighted chi-squared test. J. Am. Stat. Assoc. 96(455), 996–1003 (2001)
Chaudhuri, P.: Nonparametric estimates of regression quantiles and their local Bahadur representation. Ann. Stat. 19(2), 760–777 (1991)
Chaudhuri, P., Doksum, K., Samarov, A.: On average derivative quantile regression. Ann. Stat. 25, 715–744 (1997)
Chiaromonte, F., Cook, R.D., Li, B.: Sufficient dimension reduction in regressions with categorical predictors. Ann. Stat. 30, 475–497 (2002)
Christou, E.: Robust dimension reduction using sliced inverse median regression. Stat. Pap. (2018). https://doi.org/10.1007/s00362-018-1007-z
Christou, E., Akritas, M.G.: Single index quantile regression for heteroscedastic data. J. Multivar. Anal. 150, 169–182 (2016)
Christou, E., Akritas, M.G.: Variable selection in heteroscedastic single index quantile regression. Commun. Stat. Theory Methods 47, 6019–6033 (2018)
Christou, E., Grabchak, M.: Estimation of value-at-risk using single index quantile regression. J. Appl. Stat. 46(13), 2418–2433 (2019)
Cont, R.: Empirical properties of asset returns: stylized facts and statistical issues. Quant. Finance 1(2), 223–236 (2001)
Cook, R.D.: Regression Graphics: Ideas for Studying Regressions Through Graphics. Wiley, New York (1998)
Cook, R.D., Li, B.: Dimension reduction for conditional mean in regression. Ann. Stat. 30(2), 455–474 (2002)
Cook, R.D., Nachtsheim, C.J.: Reweighting to achieve elliptically contoured covariates in regression. J. Am. Stat. Assoc. 89(426), 592–599 (1994)
Cook, R.D., Weisberg, S., Li, K.-C.: Comment on “Sliced inverse regression for dimension reduction”. J. Am. Stat. Assoc. 86, 328–332 (1991)
Diaconis, P., Freedman, D.: Asymptotics of graphical projection pursuit. Ann. Stat. 12, 793–815 (1984)
Dong, Y., Li, B.: Dimension reduction for non-elliptically distributed predictors: second-order methods. Biometrika 97, 279–294 (2010)
Fan, Y., Härdle, W.K., Wang, W., Zhu, L.: Single-index-based CoVaR with very high-dimensional covariates. J. Bus. Econ. Stat. 36(2), 212–226 (2018)
De Gooijer, J.G., Zerom, D.: On additive conditional quantiles with high-dimensional covariates. J. Am. Stat. Assoc. 98(461), 135–146 (2003)
Grocer, S.: Beware the risks of the bitcoin: winklevii outline the downside. Wall Street J. (2013). https://blogs.wsj.com/moneybeat/2013/07/02/beware-the-risks-of-the-bitcoin-winklevii-outline-the-downside/
Guerre, E., Sabbah, C.: Uniform bias study and Bahadur representation for local polynomial estimators of the conditional quantile function. Econom. Theory 28(01), 87–129 (2012)
Harrison, D., Rubinfeld, D.L.: Hedonic prices and the demand for clean air. J. Environ. Econ. Manag. 5, 81–102 (1978)
Hristache, M., Juditsky, A., Polzehl, J., Spokoiny, V.: Structure adaptive approach for dimension reduction. Ann. Stat. 29(6), 1537–1566 (2001)
Jiang, R., Zhou, Z.-G., Qian, W.-M., Chen, Y.: Two step composite quantile regression for single-index models. Comput. Stat. Data Anal. 64, 180–191 (2013)
Koenker, R., Bassett, G.: Regression quantiles. Econometrica 46, 33–50 (1978)
Kong, E., Xia, Y.: A single-index quantile regression model and its estimation. Econom. Theory 28, 730–768 (2012)
Kong, E., Xia, Y.: An adaptive composite quantile approach to dimension reduction. Ann. Stat. 42(4), 1657–1688 (2014)
Kong, E., Linton, O., Xia, Y.: Uniform Bahadur representation for local polynomial estimates of M-regression and its application to the additive model. Econom. Theory 26, 1529–1564 (2010)
Li, K.-C.: Sliced inverse regression for dimension reduction. J. Am. Stat. Assoc. 86(414), 316–327 (1991)
Li, K.-C.: On Principal Hessian directions for data visualization and dimension reduction: another application of Stein’s Lemma. J. Am. Stat. Assoc. 87(420), 1025–1039 (1992)
Li, B., Dong, Y.: Dimension reduction for nonelliptically distributed predictors. Ann. Stat. 37, 1272–1298 (2009)
Li, K.-C., Duan, N.: Regression analysis under link violation. Ann. Stat. 17(3), 1009–1052 (1989)
Li, B., Wang, S.: On directional regression for dimension reduction. J. Am. Stat. Assoc. 102(479), 997–1008 (2007)
Li, B., Zha, H., Chiaromonte, F.: Contour regression: a general approach to dimension reduction. Ann. Stat. 33(4), 1580–1616 (2005)
Luo, W., Li, B., Yin, X.: On efficient dimension reduction with respect to a statistical functional of interest. Ann. Stat. 42(1), 382–412 (2014)
Ma, Y., Zhu, L.: A semiparametric approach to dimension reduction. J. Am. Stat. Assoc. 107(497), 168–179 (2012)
Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system (2008). Available online. https://bitcoin.org/bitcoin.pdf
Pollard, D.: Asymptotics for least absolute deviation regression estimators. Econom. Theory 7(2), 186–199 (1991)
Shin, S.J., Artemiou, A.: Penalized principal logistic regression for sparse sufficient dimension reduction. Comput. Stat. Data Anal. 111, 48–58 (2017)
Wang, H., Xia, Y.: Sliced regression for dimension reduction. J. Am. Stat. Assoc. 103, 811–821 (2008)
Wu, T.Z., Yu, K., Yu, Y.: Single index quantile regression. J. Multivar. Anal. 101(7), 1607–1621 (2010)
Xia, Y., Tong, H., Li, W.K., Zhu, L.-X.: An adaptive estimation of dimension reduction space. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 64, 363–410 (2002)
Ye, Z., Weiss, R.E.: Using the bootstrap to select one of a new class of dimension reduction methods. J. Am. Stat. Assoc. 98(464), 968–979 (2003)
Yin, X., Cook, R.D.: Dimension reduction for the conditional \(k\)th moment in regression. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 64, 159–175 (2002)
Yin, X., Li, B.: Sufficient dimension reduction based on an ensemble of minimum average variance estimators. Ann. Stat. 39, 3392–3416 (2011)
Yu, K., Jones, M.C.: Local linear quantile regression. J. Am. Stat. Assoc. 93(441), 228–238 (1998)
Yu, K., Lu, Z.: Local linear additive quantile regression. Scand. J. Stat. 31, 333–346 (2004)
Zhang, L.-M., Zhu, L.-P., Zhu, L.-X.: Sufficient dimension reduction in regressions through cumulative Hessian directions. Stat. Comput. 21(3), 325–334 (2011)
Zhu, L.-P., Zhu, L.-X.: Dimension reduction for conditional variance in regressions. Stat. Sin. 19, 869–883 (2009)
Zhu, L.-P., Zhu, L.-X., Feng, Z.-H.: Dimension reduction in regression through cumulative slicing estimation. J. Am. Stat. Assoc. 105(492), 1455–1466 (2010)
Zhu, X., Guo, X., Zhu, L.: An adaptive-to-model test for partially parametric single-index models. Stat. Comput. 27(5), 1193–1204 (2017)
Acknowledgements
We would like to thank Professors Michael Akritas and Bing Li from the Pennsylvania State University for useful discussions regarding this paper. We would also like to thank Mr. Mark Hamrick for help in running some of the simulations, and the two anonymous referees, whose comments led to improvements in the presentation of this paper.
Appendices
Appendix A: notation and assumptions
Notation We say that a function \(m(\cdot ): \mathbb {R}^{p} \rightarrow \mathbb {R}\) has the order of smoothness s on the support \(\mathcal {X}_{0}\), denoted by \(m(\cdot ) \in H_{s}(\mathcal {X}_{0})\), if (a) it is differentiable up to order [s], where [s] denotes the integer part of s, and (b) there exists a constant \(L>0\) such that, for all \(\mathbf {u}=(u_{1}, \ldots , u_{p})^\top \) with \(|\mathbf {u}|=u_{1}+ \cdots +u_{p}=[s]\), all \(\tau \) in an interval \([\underline{\tau }, \overline{\tau }]\), where \(0<\underline{\tau } \le \overline{\tau } <1\), and all \(\mathbf {x}\), \(\mathbf {x}'\) in \(\mathcal {X}_{0}\),
$$\begin{aligned} |D^{\mathbf {u}}m(\mathbf {x})-D^{\mathbf {u}}m(\mathbf {x}')| \le L \left\| \mathbf {x}-\mathbf {x}' \right\| ^{s-[s]}, \end{aligned}$$where \(D^{\mathbf {u}}m(\mathbf {x})\) denotes the partial derivative \(\partial ^{|\mathbf {u}|} m(\mathbf {x})/\partial x_{1}^{u_{1}} \cdots \partial x_{p}^{u_{p}}\), and \(\left\| \cdot \right\| \) denotes the Euclidean norm.
Assumptions
A1. The following moment conditions are satisfied for a given \(\tau \in (0,1)\):
$$\begin{aligned} E \left\| \mathbf {X}\mathbf {X}^{\top } \right\|< \infty , \quad E |Q_{\tau }(Y|\mathbf {A}^{\top }\mathbf {X})|^2< \infty , \quad E\left\{ Q_{\tau }(Y|\mathbf {A}^{\top }\mathbf {X})^2 \left\| \mathbf {X}\mathbf {X}^{\top } \right\| \right\} < \infty . \end{aligned}$$
A2. The distribution of \(\mathbf {A}^{\top }\mathbf {X}\) has a probability density function \(f_{\mathbf {A}}(\cdot )\) with respect to the Lebesgue measure, which is strictly positive and continuously differentiable over the support \(\mathcal {X}_{0}\) of \(\mathbf {X}\).
A3. The cumulative distribution function \(F_{Y| \mathbf {A}}(\cdot |\cdot )\) of Y given \(\mathbf {A}^{\top }\mathbf {X}\) has a continuous probability density function \(f_{Y|\mathbf {A}}(y|\mathbf {A}^{\top }\mathbf {x})\) with respect to the Lebesgue measure, which is strictly positive for y in \(\mathbb {R}\) and \(\mathbf {A}^{\top }\mathbf {x}\), for \(\mathbf {x}\) in \(\mathcal {X}_{0}\). The partial derivative \(\partial F_{Y| \mathbf {A}}(y| \mathbf {A}^{\top }\mathbf {x})/ \partial \mathbf {A}^{\top }\mathbf {x}\) is continuous. There is an \(L_{0}>0\) such that
$$\begin{aligned} |f_{Y|\mathbf {A}}(y|\mathbf {A}^{\top }\mathbf {x})-f_{Y|\mathbf {A}}(y'|\mathbf {A}^{\top }\mathbf {x}')| \le L_{0} \left\| (\mathbf {A}^{\top }\mathbf {x},y)-(\mathbf {A}^{\top }\mathbf {x}',y') \right\| \end{aligned}$$for all \((\mathbf {x},y), (\mathbf {x}',y')\) in \(\mathcal {X}_{0} \times \mathbb {R}\).
A4. The nonnegative kernel function \(K(\cdot )\), used in (5), is Lipschitz over \(\mathbb {R}^{d}\), \(d \ge 1\), and satisfies \(\int K(\mathbf {z})d \mathbf {z}=1\). For some \(\underline{K}>0\), \(K(\mathbf {z}) \ge \underline{K} I\{\mathbf {z} \in B(0,1)\}\), where B(0, 1) is the closed unit ball. The associated bandwidth h, used in the estimation procedure, is in \([\underline{h},\overline{h}]\) with \(0< \underline{h} \le \overline{h} < \infty \), \(\lim _{n \rightarrow \infty } \overline{h}=0\), and \(\lim _{n \rightarrow \infty } (\ln {n})/(n \underline{h}^{d})=0\).
A5. \(Q_{\tau }(Y|\mathbf {A}^{\top }\mathbf {x})\) is in \(H_{s_{\tau }}(\mathcal {T}_{\mathbf {A}})\) for some \(s_{\tau }\) with \([s_{\tau }] \le 1\), where \(\mathcal {T}_{\mathbf {A}}=\{\mathbf {z} \in \mathbb {R}^{d}: \mathbf {z}=\mathbf {A}^{\top }\mathbf {x}, \mathbf {x} \in \mathcal {X}_{0}\}\) and \(\mathcal {X}_{0}\) is the support of \(\mathbf {X}\).
Appendix B: Proofs of main results
1.1 Appendix B.1: Some lemmas
Lemma 1
Under Assumptions A2–A5 given in “Appendix A”, and the assumption that \(\widehat{\mathbf {A}}\) is a \(\sqrt{n}\)-consistent estimate of the directions of the CS,
where \(\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {x})\) denotes the local linear conditional quantile estimate of \(Q_{\tau }(Y|\mathbf {A}^{\top }\mathbf {x})\), given in (5).
Proof
Observe that
The first term follows from the Bahadur representation of \(\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {x})-\widehat{Q}_{\tau }(Y|\mathbf {A}^{\top }\mathbf {x})\) (see Guerre and Sabbah 2012) and the \(\sqrt{n}\)-consistency of \(\widehat{\mathbf {A}}\). The second term follows from Corollary 1 (ii) of Guerre and Sabbah (2012). \(\square \)
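Lemma 1 concerns the local linear conditional quantile estimate in (5). As a concrete illustration only (not the paper's implementation), the following sketch fits a kernel-weighted local linear quantile at a point by casting the check-loss minimization as a linear program; the Gaussian kernel, bandwidth, and data below are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def local_linear_quantile(z0, Z, Y, tau=0.5, h=0.3):
    """Local linear quantile fit at z0: minimize
    sum_i w_i * rho_tau(Y_i - a - b*(Z_i - z0)) over (a, b),
    written as a linear program in (a, b, u, v)."""
    n = len(Y)
    w = np.exp(-0.5 * ((Z - z0) / h) ** 2)  # Gaussian kernel weights (illustrative choice)
    D = Z - z0
    # Variables: [a, b, u_1..u_n, v_1..v_n]; constraint a + b*D_i + u_i - v_i = Y_i,
    # so at the optimum u_i and v_i are the positive/negative parts of the residual.
    c = np.concatenate([[0.0, 0.0], tau * w, (1.0 - tau) * w])
    A_eq = np.hstack([np.ones((n, 1)), D[:, None], np.eye(n), -np.eye(n)])
    bounds = [(None, None), (None, None)] + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=Y, bounds=bounds, method="highs")
    return res.x[0], res.x[1]  # fitted quantile at z0, local slope
```

On noise-free data \(Y=1+2Z\), the fit at \(z_0=0.5\) recovers the value 2.0 and slope 2, since the check loss is zero at the exact linear fit.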
Note For the study of the asymptotic properties of \(\widehat{\varvec{\beta }}_{\tau }\), defined in (4), we consider an equivalent objective function. Observe that minimizing \(\sum _{i=1}^{n}\{\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {X}_{i})-a_{\tau }-\mathbf {b}_{\tau }^{\top }\mathbf {X}_{i}\}^2\) with respect to \((a_{\tau },\mathbf {b}_{\tau })\) is equivalent to minimizing
with respect to \((a_{\tau },\mathbf {b}_{\tau })\). By expanding the square, (10) can be written as
Lemma 2
Let \(\widehat{S}_{n}(\varvec{\gamma }_{\tau }/\sqrt{n}+(\alpha ^*_{\tau },\varvec{\beta }^*_{\tau }))\) be as defined in (11), where \(\varvec{\gamma }_{\tau }=\sqrt{n} \{(a_{\tau },\mathbf {b}_{\tau })-(\alpha ^*_{\tau },\varvec{\beta }^*_{\tau })\}\) and \((\alpha ^*_{\tau }, \varvec{\beta }_{\tau }^*)\) is defined in (3). Then, under the assumptions of Lemma 1 and additionally Assumption A1 of “Appendix A”, we have the following quadratic approximation, uniformly in \(\varvec{\gamma }_{\tau }\) in a compact set,
where \(\mathbb {V}=E\{(1,\mathbf {X})(1,\mathbf {X}^{\top })^{\top }\}\),
and
Proof
Observe that
where \(\mathbb {V}_{n}=n^{-1}\sum _{i=1}^{n}(1,\mathbf {X}_{i})(1,\mathbf {X}^{\top }_{i})^{\top }\), and \(\mathbf {W}_{\tau ,n}\) and \(C_{\tau ,n}\) are defined in (12) and (13), respectively. It is easy to see that \(\mathbb {V}_{n}=\mathbb {V}+o_{p}(1)\), and therefore,
Provided that \(\mathbf {W}_{\tau ,n}\) is stochastically bounded, it follows from the convexity lemma (Pollard 1991) that the quadratic approximation to the convex function \(\widehat{S}_{n}(\varvec{\gamma }_{\tau }/\sqrt{n}+(\alpha _{\tau }^*,\varvec{\beta }^*_{\tau }))\) holds uniformly for \(\varvec{\gamma }_{\tau }\) in a compact set. It remains to prove that \(\mathbf {W}_{\tau ,n}\) is stochastically bounded.
Since \(\mathbf {W}_{\tau ,n}\) involves the quantity \(\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {X}_{i})\), which is data dependent rather than a deterministic function, we define
where \(\phi _{\tau }: \mathbb {R}^{d+1} \rightarrow \mathbb {R}\) is a function in the class \(\Phi _{\tau }\), whose value at \((y,\mathbf {A}^\top \mathbf {x}) \in \mathbb {R}^{d+1}\) is written \(\phi _{\tau }(y|\mathbf {A}^{\top }\mathbf {x})\), belonging to the non-separable space \(l^{\infty }=\{\phi _{\tau }: \mathbb {R}^{d+1} \rightarrow \mathbb {R} \ \text {such that} \ \left\| \phi _{\tau }\right\| _{\infty }:= \sup _{(y,\mathbf {A}^{\top }\mathbf {x}) \in \mathbb {R}^{d+1}} |\phi _{\tau }(y|\mathbf {A}^{\top }\mathbf {x})|<\infty \}\), and satisfying \(E|\phi _{\tau }(Y|\mathbf {A}^{\top }\mathbf {X})|^2 < \infty \) and
Since \(\Phi _{\tau }\) includes \(Q_{\tau }(Y|\mathbf {A}^{\top }\mathbf {x})\) and, according to Lemma 1, includes \(\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {x})\) for n large enough, almost surely, we prove that \(\mathbf {W}_{\tau ,n}(\phi _{\tau })\) is stochastically bounded, uniformly in \(\phi _{\tau } \in \Phi _{\tau }\).
Observe that
which follows from the properties of the class \(\Phi _{\tau }\) defined above. A bounded second moment implies that \(\mathbf {W}_{\tau , n}(\phi _{\tau })\) is stochastically bounded. Since (1) the result was proven uniformly in \(\phi _{\tau }\), and (2) the class \(\Phi _{\tau }\) includes \(\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {x})\) for n large enough, almost surely, the proof follows. \(\square \)
1.2 Appendix B.2: Proof of Theorem 4
To prove the \(\sqrt{n}\)-consistency of \(\widehat{\varvec{\beta }}_{\tau }\), it is enough to show that for any given \(\delta _{\tau }>0\), there exists a constant \(C_{\tau }\) such that
where \(\widehat{S}_{n}(\varvec{\gamma }_{\tau }/\sqrt{n}+(\alpha _{\tau },\varvec{\beta }_{\tau }))\) is defined in (10). Inequality (14) implies that, with probability at least \(1-\delta _{\tau }\), there exists a local minimum in the ball \(\{\varvec{\gamma }_{\tau }/\sqrt{n}+(\alpha _{\tau }^*,\varvec{\beta }^*_{\tau }): \left\| \varvec{\gamma }_{\tau } \right\| \le C_{\tau }\}\). This in turn implies that there exists a local minimizer such that \(\left\| (\widehat{\alpha }_{\tau },\widehat{\varvec{\beta }}_{\tau })-(\alpha _{\tau }^*,\varvec{\beta }^*_{\tau }) \right\| =O_{p}\left( n^{-1/2} \right) \). The quadratic approximation derived in Lemma 2 yields that
for any \(\varvec{\gamma }_{\tau }\) in a compact subset of \(\mathbb {R}^{p+1}\). Therefore, the difference (15) is dominated by the quadratic term \((1/2)\varvec{\gamma }_{\tau }^\top \mathbb {V}\varvec{\gamma }_{\tau }\) for \(\left\| \varvec{\gamma }_{\tau }\right\| \ge C_{\tau }\) with \(C_{\tau }\) sufficiently large. Hence, (14) follows. \(\square \)
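Theorem 4 concerns the estimator \(\widehat{\varvec{\beta }}_{\tau }\) from (4), namely the least-squares fit of the estimated conditional quantiles on the predictors. As a minimal numerical sketch of that fitting step, with synthetic values standing in for the fitted quantiles \(\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {X}_{i})\) (the function name below is our own):

```python
import numpy as np

def quantile_ols_direction(Qhat, X):
    """Least-squares fit of fitted conditional quantiles Qhat on the
    predictors X: argmin over (a, b) of sum_i (Qhat_i - a - b^T X_i)^2.
    Returns the intercept a_hat and slope vector b_hat."""
    n = X.shape[0]
    design = np.hstack([np.ones((n, 1)), X])  # intercept column + predictors
    coef, *_ = np.linalg.lstsq(design, Qhat, rcond=None)
    return coef[0], coef[1:]
```

When the fitted quantiles are exactly linear in the predictors, the slope vector is recovered exactly, which is the population target \(\varvec{\beta }^*_{\tau }\) in that idealized case.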
1.3 Appendix B.3: Proof of Theorem 8
Let \(\widehat{\mathbf {V}}_{\tau }=(\widehat{\varvec{\beta }}_{\tau ,0}, \dots , \widehat{\varvec{\beta }}_{\tau ,p-1})\) be a \(p \times p\) matrix, where \(\widehat{\varvec{\beta }}_{\tau ,0}=\widehat{\varvec{\beta }}_{\tau }\), defined in (4), and \(\widehat{\varvec{\beta }}_{\tau ,j}=E_{n}\{\widehat{Q}_{\tau }(Y|\widehat{\varvec{\beta }}_{\tau ,j-1}^{\top }\mathbf {X})\mathbf {X}\}\) for \(j=1,\dots ,p-1\). Moreover, let \(\mathbf {V}_{\tau }\) denote the population version of \(\widehat{\mathbf {V}}_{\tau }\). It is easy to see that \(\widehat{\mathbf {V}}_{\tau }\) converges to \(\mathbf {V}_{\tau }\) at the \(\sqrt{n}\)-rate; this follows from the central limit theorem and Lemma 1. Then, with \(\left\| \cdot \right\| \) denoting the Frobenius norm,
and the eigenvectors of \(\widehat{\mathbf {V}}_{\tau } \widehat{\mathbf {V}}_{\tau }^{\top }\) converge to the corresponding eigenvectors of \(\mathbf {V}_{\tau } \mathbf {V}_{\tau }^{\top }\). Finally, the subspace spanned by the \(d_{\tau }\) eigenvectors of \(\mathbf {V}_{\tau } \mathbf {V}_{\tau }^{\top }\) falls into \(\mathcal {S}_{Q_{\tau }(Y|\mathbf {X})}\) and the proof is complete. \(\square \)
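The construction above can be sketched numerically. The following is an illustrative implementation under simplifying assumptions that are ours, not the paper's: the initial direction \(\widehat{\varvec{\beta }}_{\tau ,0}\) is taken from a least-squares fit of Y on \(\mathbf {X}\) (a stand-in for (4)), and \(\widehat{Q}_{\tau }(Y|\mathbf {b}^{\top }\mathbf {X})\) is approximated by a crude nearest-neighbor sample quantile along the index rather than the local linear estimate in (5).

```python
import numpy as np

def knn_quantile(index, Y, tau, k=50):
    """Crude stand-in for Qhat_tau(Y | index): the tau-quantile of Y over
    the k nearest neighbors of each point along the 1-D index."""
    n = len(Y)
    q = np.empty(n)
    for i in range(n):
        nbrs = np.argsort(np.abs(index - index[i]))[:k]
        q[i] = np.quantile(Y[nbrs], tau)
    return q

def cqs_directions(X, Y, tau=0.5, d_tau=1):
    """Build V_tau = (beta_0, ..., beta_{p-1}) with
    beta_j = E_n{ Qhat_tau(Y | beta_{j-1}^T X) X }, then return the top
    d_tau eigenvectors of V V^T as an estimated basis."""
    n, p = X.shape
    # Initial direction: least-squares slope of Y on X (illustrative stand-in for (4))
    b = np.linalg.lstsq(np.hstack([np.ones((n, 1)), X]), Y, rcond=None)[0][1:]
    V = np.empty((p, p))
    V[:, 0] = b
    for j in range(1, p):
        q = knn_quantile(X @ V[:, j - 1], Y, tau)
        V[:, j] = (q[:, None] * X).mean(axis=0)  # E_n{ Qhat * X }
    eigvals, eigvecs = np.linalg.eigh(V @ V.T)   # eigh returns ascending eigenvalues
    return eigvecs[:, -d_tau:][:, ::-1]          # leading d_tau eigenvectors
```

On a simulated single-index model with standard normal predictors, the leading eigenvector aligns closely with the true direction, up to sign, consistent with the theorem's conclusion that the span falls into \(\mathcal {S}_{Q_{\tau }(Y|\mathbf {X})}\).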
Christou, E. Central quantile subspace. Stat Comput 30, 677–695 (2020). https://doi.org/10.1007/s11222-019-09915-8