Abstract
This paper is concerned with feature screening for ultrahigh dimensional discriminant analysis. A new feature screening procedure based on the conditional quantile is proposed. The proposed procedure has two desirable features. First, it is model-free: it does not require a specific discriminant model and can be applied directly to the multi-category setting. Second, it is robust against heavy-tailed distributions, potential outliers, and a shortage of samples in some categories, all of which are common in high dimensional data. We establish the sure screening property and the ranking consistency property of the proposed procedure under some regularity conditions. Simulation studies and a real data example are used to assess its finite sample performance.
References
Armstrong SA, Staunton JE, Silverman LB, Pieters R, Den Boer ML, Minden MD, Sallan SE, Lander ES, Golub TR, Korsmeyer SJ (2002) MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia. Nat Genet 30:41–47
Chen X, Chen X, Liu Y (2017) A note on quantile feature screening via distance correlation. Stat Papers 60:1741–1762
Cheng G, Li X, Lai P, Song F, Yu J (2017) Robust rank screening for ultrahigh dimensional discriminant analysis. Stat Comput 27:535–545
Cui H, Li R, Zhong W (2015) Model-free feature screening for ultrahigh dimensional discriminant analysis. J Am Stat Assoc 110:630–641
Fan J, Fan Y (2008) High dimensional classification using features annealed independence rules. Ann Stat 36:2605–2637
Fan J, Lv J (2008) Sure independence screening for ultrahigh dimensional feature space. J R Stat Soc Ser B 70:849–911
Fan J, Song R (2010) Sure independence screening in generalized linear models with NP-dimensionality. Ann Stat 38:3567–3604
Fan J, Feng Y, Song R (2011) Nonparametric independence screening in sparse ultra-high dimensional additive models. J Am Stat Assoc 106:544–557
Hoeffding W (1963) Probability inequalities for sums of bounded random variables. J Am Stat Assoc 58:13–30
Lai P, Song F, Chen K, Liu Z (2017) Model free feature screening with dependent variable in ultrahigh dimensional binary classification. Stat Probab Lett 125:141–148
Li R, Zhong W, Zhu L (2012) Feature screening via distance correlation learning. J Am Stat Assoc 107:1129–1139
Liu J, Li R, Wu R (2014) Feature selection for varying coefficient models with ultrahigh-dimensional covariates. J Am Stat Assoc 109:266–274
Lo SH, Singh K (1986) The product-limit estimator and the bootstrap: some asymptotic representations. Probab Theory Relat Fields 71:455–465
Mai Q, Zou H (2013) The Kolmogorov filter for variable screening in high-dimensional binary classification. Biometrika 100:229–234
Pan R, Wang H, Li R (2016) Ultrahigh dimensional multi-class linear discriminant analysis by pairwise sure independence screening. J Am Stat Assoc 111:169–179
Song F, Lai P, Shen B, Cheng G (2018) Variance ratio screening for ultrahigh dimensional discriminant analysis. Commun Stat Theory Methods 47:6034–6051
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B 58:267–288
Tibshirani R, Hastie T, Narasimhan B, Chu G (2002) Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci 99:6567–6572
Wu Y, Yin G (2015) Conditional quantile screening in ultrahigh-dimensional heterogeneous data. Biometrika 102:65–76
Zhu L, Li L, Li R, Zhu L (2011) Model-free feature screening for ultrahigh-dimensional data. J Am Stat Assoc 106:1464–1475
Acknowledgements
Peng Lai’s research was supported by the National Natural Science Foundation of China (Grant No. 11771215) and the Natural Science Foundation of Jiangsu Province (Grant No. BK20161530).
Appendix
To prove the two theorems, we present the following lemma.
Lemma 1
[Hoeffding’s Inequality; Hoeffding (1963)] Let \(X_1,\ldots ,X_n\) be independent random variables. Assume that \(P(X_i\in [a_i,b_i])=1\) for \(1\le {i}\le {n}\), where \(a_i\) and \(b_i\) are constants. Let \(\overline{X}=\frac{1}{n}\sum _{i=1}^n{X_i}\). Then the following inequality holds:
$$P\big (|\overline{X}-E(\overline{X})|\ge t\big )\le 2\exp \Big (-\frac{2n^{2}t^{2}}{\sum _{i=1}^{n}(b_i-a_i)^{2}}\Big ),$$
where \(t\) is a positive constant and \(E(\overline{X})\) is the expected value of \(\overline{X}\).
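As a quick numerical illustration of the lemma, the sketch below compares the two-sided Hoeffding bound with a Monte Carlo estimate of the tail probability for a sample proportion, the quantity the lemma is later applied to through \(\hat{p}_{r}-p_{r}\). It assumes all variables share the common range \([a,b]=[0,1]\), so the bound reduces to \(2\exp (-2nt^{2})\). The function names and the values of \(n\), \(t\) and \(p\) are illustrative choices, not taken from the paper.

```python
import numpy as np

def hoeffding_bound(n, t, a=0.0, b=1.0):
    """Two-sided Hoeffding bound for the mean of n independent variables in [a, b].

    With a common range, sum_i (b_i - a_i)^2 = n (b - a)^2, so the bound
    2 exp(-2 n^2 t^2 / sum_i (b_i - a_i)^2) becomes 2 exp(-2 n t^2 / (b - a)^2).
    """
    return 2.0 * np.exp(-2.0 * n * t**2 / (b - a) ** 2)

def empirical_tail(n, t, p=0.3, reps=20000, seed=0):
    """Monte Carlo estimate of P(|p_hat - p| >= t) for n Bernoulli(p) indicators."""
    rng = np.random.default_rng(seed)
    p_hat = rng.binomial(n, p, size=reps) / n  # sample proportions
    return float(np.mean(np.abs(p_hat - p) >= t))

# Illustrative values (assumptions, not from the paper).
n, t = 200, 0.1
emp = empirical_tail(n, t)
bound = hoeffding_bound(n, t)
print(f"empirical tail = {emp:.4f}, Hoeffding bound = {bound:.4f}")
```

The empirical tail should sit below the bound. The bound is typically conservative, which is harmless in the proofs because only its exponential decay in \(n\) is used.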
Proof of Theorem 1
According to the definitions of \(\omega _{j}\) and \(\hat{\omega }_{j}\), we have
where \(\tilde{\omega }_{j}=\frac{1}{M}\sum _{k=1}^{M}\sum _{r=1}^{R_{n}}p_{r}\Big (Q_{\tau _{k}}(X_{j}|Y=y_{r})-Q_{\tau _{k}}(X_{j})\Big )^{2}\). By the properties of the integral, \(|\tilde{\omega }_{j}-\omega _{j}|=O(M^{-2})\); hence, when \(M> \sqrt{2/\varepsilon }\), we get \(|\tilde{\omega }_{j}-\omega _{j}|\le \frac{ \varepsilon }{2}\). Consequently,
In fact,
By Condition (C1) and Lemma 3 of Lo and Singh (1986), we have \(\sup _{\tau _{k}\in (0,1)}\big |\widehat{Q}_{\tau _{k}}(X_{j}|Y=y_{r})-Q_{\tau _{k}}(X_{j}|Y=y_{r}) \big |=O(n^{-1/2}(\log (n))^{1/2})\) and \(\sup _{\tau _{k}\in (0,1)}\big |\widehat{Q}_{\tau _{k}}(X_{j})-Q_{\tau _{k}}(X_{j}) \big |=O(n^{-1/2}(\log (n))^{1/2})\). Taking \(n\) large enough that \(\frac{\log n}{n^{1-2\alpha }}\le c_1\varepsilon ^2\), where \(0<\alpha <\frac{1}{2}\) and \(c_1\) is some positive constant, we deduce that \(\sum _{r=1}^{R_{n}}\hat{p}_{r}\Big [\Big (\widehat{Q}_{\tau _{k}}(X_{j}|Y=y_{r})-\widehat{Q}_{\tau _{k}}(X_{j})\Big )^{2}- \Big (Q_{\tau _{k}}(X_{j}|Y=y_{r})-Q_{\tau _{k}}(X_{j})\Big )^{2}\Big ]\le \frac{\varepsilon }{4}\). We therefore have
Now, we define \(Z_{i,r}=I\{Y_{i}=y_{r}\}-p_{r}\). Then, for any fixed \(r\), the \(Z_{i,r}\) are independent across \(i\), with \(E(Z_{i,r})=0\) and \(|Z_{i,r}|\le 1\). Thus, noting that \(\hat{p}_{r}-p_{r}=\frac{1}{n}\sum _{i=1}^n Z_{i,r}\), by Hoeffding’s Inequality,
Then, we can get
Taking \(M=O(n^{\beta })\) and \(R_{n}=O(n^{\alpha })\), with \(0\le \kappa <{\frac{1}{2}-\alpha }\) and \(0<\alpha <\frac{1}{2}\), we have
Next, we deal with the second part of Theorem 1. If \(\mathcal {A}\nsubseteq \mathcal {\hat{A}}\), then there must exist some \(j\in \mathcal {A}\) such that \(\hat{\omega }_j<cn^{-\kappa }\). It follows from Condition (C2) that \(|\hat{\omega }_j-\omega _j|>{cn^{-\kappa }}\) for some \(j\in \mathcal {A}\). This indicates that \(\{\mathcal {A}\nsubseteq \mathcal {\hat{A}}\}\subseteq \{|\hat{\omega }_j-\omega _j|>{cn^{-\kappa }} \text{ for some } j\in \mathcal {A}\}\). Hence, \(\{\max \limits _{j\in \mathcal {A}}|\hat{\omega }_j-\omega _j|\le {cn^{-\kappa }}\}\subseteq \{\mathcal {A}\subseteq \mathcal {\hat{A}}\}\). Consequently, for \(0\le \kappa <{\frac{1}{2}-\alpha }\) and \(0<\alpha <\frac{1}{2}\),
where \(s_n\) is the cardinality of \(\mathcal {A}\). This completes the proof of Theorem 1. \(\square \)
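To make the objects in this proof concrete, the following is a minimal sketch of the sample utility \(\hat{\omega }_{j}\), obtained by replacing \(p_{r}\) and the quantiles in \(\tilde{\omega }_{j}\) with their empirical counterparts, together with the selected set \(\{j:\hat{\omega }_{j}\ge cn^{-\kappa }\}\). Several details are assumptions made only for illustration: the equally spaced grid \(\tau _{k}=k/(M+1)\), the use of `numpy.quantile` with its default interpolation as the quantile estimator, the values of \(c\), \(\kappa \) and \(M\), and the function names `quantile_screening_utility` and `screen`.

```python
import numpy as np

def quantile_screening_utility(x, y, M=20):
    """Sample analogue of omega_tilde_j for one feature x and class labels y.

    Assumption: quantile levels are the equally spaced grid tau_k = k / (M + 1).
    """
    taus = np.arange(1, M + 1) / (M + 1)
    q_all = np.quantile(x, taus)            # estimated Q_tau(X_j)
    omega = 0.0
    for label in np.unique(y):
        mask = (y == label)
        p_hat = mask.mean()                 # estimated class proportion p_r
        q_cls = np.quantile(x[mask], taus)  # estimated Q_tau(X_j | Y = y_r)
        # (1/M) sum_k p_r (Q_tau_k(X_j | Y = y_r) - Q_tau_k(X_j))^2
        omega += p_hat * np.mean((q_cls - q_all) ** 2)
    return omega

def screen(X, y, c=1.0, kappa=0.25, M=20):
    """Return all utilities and the indices j with omega_hat_j >= c * n^(-kappa)."""
    n, p = X.shape
    omegas = np.array([quantile_screening_utility(X[:, j], y, M) for j in range(p)])
    keep = np.flatnonzero(omegas >= c * n ** (-kappa))
    return omegas, keep

# Toy data (hypothetical): heavy-tailed t(2) noise, three classes, and two
# active features whose location depends on the class label.
rng = np.random.default_rng(1)
n, p = 300, 50
y = rng.integers(0, 3, size=n)
X = rng.standard_t(df=2, size=(n, p))
X[:, 0] += 2.0 * y
X[:, 1] += 3.0 * (y == 2)

omegas, keep = screen(X, y)
print("two largest utilities at features:", np.argsort(omegas)[-2:])
print("selected features:", keep)
```

In this toy setup the two shifted features should receive the largest utilities. Which noise features also pass the threshold depends on the illustrative constants \(c\) and \(\kappa \), which in practice have to be chosen or replaced by a fixed model size.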
Proof of Theorem 2
If \(\delta =\min \limits _{j\in \mathcal {A}}\omega _j-\max \limits _{j\in \mathcal {I}}\omega _{j}>0\), we can get
\(\square \)
Cite this article
Song, F., Lai, P. & Shen, B. Robust composite weighted quantile screening for ultrahigh dimensional discriminant analysis. Metrika 83, 799–820 (2020). https://doi.org/10.1007/s00184-019-00758-x
DOI: https://doi.org/10.1007/s00184-019-00758-x