Robust composite weighted quantile screening for ultrahigh dimensional discriminant analysis

Song, Fengli; Lai, Peng; Shen, Baohua

doi:10.1007/s00184-019-00758-x

Published: 06 January 2020

Volume 83, pages 799–820, (2020)
Metrika

Fengli Song¹,
Peng Lai¹ &
Baohua Shen¹

Abstract

This paper is concerned with feature screening for ultrahigh dimensional discriminant analysis. A new feature screening procedure based on the conditional quantile is proposed. The proposed procedure has some desirable features. First, it is model-free: it does not require a specific discriminant model and can be applied directly to the multi-category situation. Second, it is robust against heavy-tailed distributions, potential outliers and sample shortage in some categories, all of which are very common in high dimensional data. We establish the sure screening property and the ranking consistency property of the proposed procedure under some regularity conditions. Simulation studies and a real data example are used to assess its finite sample performance.

References

Armstrong SA, Staunton JE, Silverman LB, Pieters R, Den Boer ML, Minden MD, Sallan SE, Lander ES, Golub TR, Korsmeyer SJ (2002) MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia. Nat Genet 30:41–47
Chen X, Chen X, Liu Y (2017) A note on quantile feature screening via distance correlation. Stat Papers 60:1741–1762
Cheng G, Li X, Lai P, Song F, Yu J (2017) Robust rank screening for ultrahigh dimensional discriminant analysis. Stat Comput 27:535–545
Cui H, Li R, Zhong W (2015) Model-free feature screening for ultrahigh dimensional discriminant analysis. J Am Stat Assoc 110:630–641
Fan J, Fan Y (2008) High dimensional classification using features annealed independence rules. Ann Stat 36:2605–2637
Fan J, Lv J (2008) Sure independence screening for ultrahigh dimensional feature space. J R Stat Soc Ser B 70:849–911
Fan J, Song R (2010) Sure independence screening in generalized linear models with NP-dimensionality. Ann Stat 38:3567–3604
Fan J, Feng Y, Song R (2011) Nonparametric independence screening in sparse ultra-high dimensional additive models. J Am Stat Assoc 106:544–557
Hoeffding W (1963) Probability inequalities for sums of bounded random variables. J Am Stat Assoc 58:13–30
Lai P, Song F, Chen K, Liu Z (2017) Model free feature screening with dependent variable in ultrahigh dimensional binary classification. Stat Probab Lett 125:141–148
Li R, Zhong W, Zhu L (2012) Feature screening via distance correlation learning. J Am Stat Assoc 107:1129–1139
Liu J, Li R, Wu R (2014) Feature selection for varying coefficient models with ultrahigh-dimensional covariates. J Am Stat Assoc 109:266–274
Lo SH, Singh K (1986) The product-limit estimator and the bootstrap: some asymptotic representations. Probab Theory Relat Fields 71:455–465
Mai Q, Zou H (2013) The Kolmogorov filter for variable screening in high-dimensional binary classification. Biometrika 100:229–234
Pan R, Wang H, Li R (2016) Ultrahigh dimensional multi-class linear discriminant analysis by pairwise sure independence screening. J Am Stat Assoc 111:169–179
Song F, Lai P, Shen B, Cheng G (2018) Variance ratio screening for ultrahigh dimensional discriminant analysis. Commun Stat Theory Methods 47:6034–6051
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B (Methodol) 58:267–288
Tibshirani R, Hastie T, Narasimhan B, Chu G (2002) Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci 99:6567–6572
Wu Y, Yin G (2015) Conditional quantile screening in ultrahigh-dimensional heterogeneous data. Biometrika 102:65–76
Zhu L, Li L, Li R, Zhu L (2011) Model-free feature screening for ultrahigh-dimensional data. J Am Stat Assoc 106:1464–1475

Acknowledgements

Peng Lai’s research was supported by the National Natural Science Foundation of China (Grant No. 11771215) and the Natural Science Foundation of Jiangsu Province (Grant No. BK20161530).

Author information

Authors and Affiliations

School of Mathematics and Statistics, Nanjing University of Information Science and Technology, Nanjing, 210044, China
Fengli Song, Peng Lai & Baohua Shen

Corresponding author

Correspondence to Peng Lai.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

To prove the two theorems, we present the following lemma.

Lemma 1

[Hoeffding’s Inequality; Hoeffding (1963)] Let $X_1,\ldots ,X_n$ be independent random variables. Assume that $P(X_i\in [a_i,b_i])=1$ for $1\le {i}\le {n}$, where $a_i$ and $b_i$ are constants. Let $\overline{X}=\frac{1}{n}\sum _{i=1}^n{X_i}$. Then the following inequality holds:

$$\begin{aligned} P\left( \big |\overline{X}-E(\overline{X})\big |\ge {t}\right) \le {2\exp \left\{ -\frac{2n^{2}t^{2}}{\sum _{i=1}^n{(b_i-a_i)^{2}}}\right\} }, \end{aligned}$$

where $t$ is a positive constant and $E(\overline{X})$ is the expected value of $\overline{X}$.
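As an informal numerical illustration of Lemma 1 (not part of the original proof), the sketch below compares the right-hand side of the inequality with a Monte Carlo estimate of the tail probability. The choice of Uniform(0, 1) variables, so that $a_i=0$ and $b_i=1$, and the helper names `hoeffding_bound` and `empirical_tail` are assumptions made purely for illustration.

```python
import numpy as np


def hoeffding_bound(n, t, widths):
    """Right-hand side of Lemma 1: 2 * exp(-2 n^2 t^2 / sum_i (b_i - a_i)^2)."""
    widths = np.asarray(widths, dtype=float)
    return float(2.0 * np.exp(-2.0 * n**2 * t**2 / np.sum(widths**2)))


def empirical_tail(n, t, n_rep=20000, seed=0):
    """Monte Carlo estimate of P(|mean - E(mean)| >= t) for Uniform(0, 1) draws."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=(n_rep, n))  # each row is one sample of size n
    deviation = np.abs(x.mean(axis=1) - 0.5)    # E(mean) = 0.5 for Uniform(0, 1)
    return float(np.mean(deviation >= t))


n, t = 50, 0.1
bound = hoeffding_bound(n, t, np.ones(n))  # every width b_i - a_i equals 1
freq = empirical_tail(n, t)
print(f"empirical tail = {freq:.4f}, Hoeffding bound = {bound:.4f}")
```

The bound is distribution-free, so the simulated frequency is expected to sit well below it; the gap reflects how conservative the inequality is for this particular distribution.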

Proof of Theorem 1

According to the definitions of $\omega _{j}$ and $\hat{\omega }_{j}$, we have

$$\begin{aligned}&P \left\{ |\hat{\omega }_{j}-\omega _{j}|\ge \varepsilon \right\} \\&\quad =P\left\{ \Big |\frac{1}{M}\sum _{k=1}^{M}\sum _{r=1}^{R_{n}}\hat{p}_{r}\left( \widehat{Q}_{\tau _{k}}(X_{j}|Y=y_{r})-\widehat{Q}_{\tau _{k}}(X_{j})\right) ^{2}\right. \\&\qquad -\,\frac{1}{M}\sum _{k=1}^{M}\sum _{r=1}^{R_{n}}p_{r}\left( Q_{\tau _{k}}(X_{j}|Y=y_{r})-Q_{\tau _{k}}(X_{j})\right) ^{2}\\&\qquad \left. +\,\frac{1}{M}\sum _{k=1}^{M}\sum _{r=1}^{R_{n}}p_{r}\left( Q_{\tau _{k}}(X_{j}|Y=y_{r})-Q_{\tau _{k}}(X_{j})\right) ^{2}-\omega _{j}\Big |\ge \varepsilon \right\} \\&\qquad \triangleq P\left\{ \big |\hat{\omega }_{j}-\tilde{\omega }_{j}+\tilde{\omega }_{j}-\omega _{j}\big |\ge \varepsilon \right\} , \end{aligned}$$

where $\tilde{\omega }_{j}=\frac{1}{M}\sum _{k=1}^{M}\sum _{r=1}^{R_{n}}p_{r}\Big (Q_{\tau _{k}}(X_{j}|Y=y_{r})-Q_{\tau _{k}}(X_{j})\Big )^{2}$. By the properties of the integral, $|\tilde{\omega }_{j}-\omega _{j}|=O(M^{-2})$, so when $M> \sqrt{2/\varepsilon }$ we get $|\tilde{\omega }_{j}-\omega _{j}|\le \frac{ \varepsilon }{2}$. Consequently,

$$\begin{aligned}&P\left\{ |\hat{\omega }_{j}-\tilde{\omega }_{j}|\ge \varepsilon /2 \right\} \\&\quad \le \sum _{k=1}^{M} P\left\{ \Big |\sum _{r=1}^{R_{n}} \left[ \hat{p}_{r}\left( \widehat{Q}_{\tau _{k}}(X_{j}|Y=y_{r})-\widehat{Q}_{\tau _{k}}(X_{j})\right) ^{2}\right. \right. \\&\qquad \left. \left. -\,p_{r}\left( Q_{\tau _{k}}(X_{j}|Y=y_{r})-Q_{\tau _{k}}(X_{j})\right) ^{2}\right] \Big |\ge \varepsilon /2\right\} \\&\quad \le \sum _{k=1}^{M} P\left\{ \Big |\sum _{r=1}^{R_{n}} \left[ \hat{p}_{r}\left( \widehat{Q}_{\tau _{k}}(X_{j}|Y=y_{r})-\widehat{Q}_{\tau _{k}}(X_{j})\right) ^{2}\right. \right. \\&\qquad -\,\hat{p}_{r}\left( Q_{\tau _{k}}(X_{j}|Y=y_{r})-Q_{\tau _{k}}(X_{j})\right) ^{2}\\&\qquad \left. \left. +\,(\hat{p}_{r}-p_{r})\left( Q_{\tau _{k}}(X_{j}|Y=y_{r})-Q_{\tau _{k}}(X_{j})\right) ^{2}\right] \Big |\ge \varepsilon /2\right\} . \end{aligned}$$

In fact,

$$\begin{aligned}&\left( \widehat{Q}_{\tau _{k}}(X_{j}|Y=y_{r})-\widehat{Q}_{\tau _{k}}(X_{j})\right) ^{2}-\left( Q_{\tau _{k}}(X_{j}|Y=y_{r})-Q_{\tau _{k}}(X_{j})\right) ^{2}\\&\quad = 2\left[ Q_{\tau _{k}}(X_{j}|Y=y_{r})-Q_{\tau _{k}}(X_{j})\right] \left[ \left( \widehat{Q}_{\tau _{k}}(X_{j}|Y=y_{r})-Q_{\tau _{k}}(X_{j}|Y=y_{r})\right) +\left( Q_{\tau _{k}}(X_{j})-\widehat{Q}_{\tau _{k}}(X_{j})\right) \right] \\&\qquad +\,\left[ \left( \widehat{Q}_{\tau _{k}}(X_{j}|Y=y_{r})-Q_{\tau _{k}}(X_{j}|Y=y_{r})\right) +\left( Q_{\tau _{k}}(X_{j})-\widehat{Q}_{\tau _{k}}(X_{j})\right) \right] ^2. \end{aligned}$$

By Condition (C1) and Lemma 3 of Lo and Singh (1986), we have $\sup _{\tau _{k}\in (0,1)}\big |\widehat{Q}_{\tau _{k}}(X_{j}|Y=y_{r})-Q_{\tau _{k}}(X_{j}|Y=y_{r}) \big |=O(n^{-1/2}(\log (n))^{1/2})$ and $\sup _{\tau _{k}\in (0,1)}\big |\widehat{Q}_{\tau _{k}}(X_{j})-Q_{\tau _{k}}(X_{j}) \big |=O(n^{-1/2}(\log (n))^{1/2})$. Take $n$ large enough that, for $0<\alpha <\frac{1}{2}$, $\frac{\log n}{n^{1-2\alpha }}\le c_1\varepsilon ^2$, where $c_1$ is some positive constant. This implies $\sum _{r=1}^{R_{n}}\hat{p}_{r}\Big [\Big (\widehat{Q}_{\tau _{k}}(X_{j}|Y=y_{r})-\widehat{Q}_{\tau _{k}}(X_{j})\Big )^{2}- \Big (Q_{\tau _{k}}(X_{j}|Y=y_{r})-Q_{\tau _{k}}(X_{j})\Big )^{2}\Big ]\le \frac{\varepsilon }{4}$, and therefore

$$\begin{aligned} P \left\{ |\hat{\omega }_{j}-\tilde{\omega }_{j}|\ge \varepsilon /4 \right\}\le & {} \sum _{k=1}^{M} P\left\{ \Big |\sum _{r=1}^{R_{n}}(\hat{p}_{r}-p_{r})\Big |\ge \frac{\varepsilon }{4c_{1}}\right\} \\\le & {} \sum _{k=1}^{M} \sum _{r=1}^{R_{n}}P\left\{ \big |\hat{p}_{r}-p_{r}\big |\ge \frac{\varepsilon }{4R_{n}c_{1}}\right\} . \end{aligned}$$

Now define $Z_{i,r}=I\{Y_{i}=y_{r}\}-p_{r}$. For any fixed $r$, the $Z_{i,r}$ are independent across $i$, with $E(Z_{i,r})=0$ and $|Z_{i,r}|\le 1$. Thus, noting that $\hat{p}_{r}-p_{r}=\frac{1}{n}\sum _{i=1}^n Z_{i,r}$, Hoeffding’s Inequality gives

$$\begin{aligned} P\left( \big |\hat{p}_{r}-p_{r}\big |>\varepsilon \right) =P\left( \Big |\frac{1}{n}\sum _{i=1}^{n}Z_{i,r}\Big |>\varepsilon \right) \le 2\exp \{-2n\varepsilon ^2\}. \end{aligned}$$

Then, we can get

$$\begin{aligned} P \left\{ |\hat{\omega }_{j}-\tilde{\omega }_{j}|\ge \frac{\varepsilon }{4}\right\} \le 2 M R_{n}\exp \left\{ -\frac{n \varepsilon ^2}{8c_{1}^2R^2_{n}}\right\} . \end{aligned}$$
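For readers tracing the constants, the exponent in the last display can be recovered by substituting the threshold $\frac{\varepsilon }{4R_{n}c_{1}}$ from the union bound into the Hoeffding bound $2\exp \{-2n\varepsilon ^2\}$ for $\hat{p}_{r}-p_{r}$. This is a sketch of the omitted arithmetic only and introduces no new assumptions:

$$\begin{aligned} \sum _{k=1}^{M}\sum _{r=1}^{R_{n}}P\left\{ \big |\hat{p}_{r}-p_{r}\big |\ge \frac{\varepsilon }{4R_{n}c_{1}}\right\} \le \sum _{k=1}^{M}\sum _{r=1}^{R_{n}}2\exp \left\{ -2n\left( \frac{\varepsilon }{4R_{n}c_{1}}\right) ^{2}\right\} =2MR_{n}\exp \left\{ -\frac{n\varepsilon ^2}{8c_{1}^2R_{n}^2}\right\} . \end{aligned}$$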

Taking $M=O(n^{\beta })$ and $R_{n}=O(n^{\alpha })$, for $0\le \kappa <{\frac{1}{2}-\alpha }$ and $0<\alpha <\frac{1}{2}$ we have

$$\begin{aligned} P\left( \max \limits _{1\le {j}\le {p}}|\hat{\omega }_{j}-\omega _{j}|\ge {cn^{-\kappa }}\right) \le {O\left( p(n^{\beta +\alpha })\exp \left\{ -cn^{1-2\alpha -2\kappa }\right\} \right) }. \end{aligned}$$

Next, we deal with the second part of Theorem 1. If $\mathcal {A}\nsubseteq \mathcal {\hat{A}}$, then there must exist some $j\in \mathcal {A}$ such that $\hat{\omega }_j<cn^{-\kappa }$. It follows from Condition (C2) that $|\hat{\omega }_j-\omega _j|>{cn^{-\kappa }}$ for some $j\in \mathcal {A}$. This indicates that $\{\mathcal {A}\nsubseteq \mathcal {\hat{A}}\}\subseteq \{|\hat{\omega }_j-\omega _j|>{cn^{-\kappa }} \text{ for some } j\in \mathcal {A}\}$. Hence, $\{\max \limits _{j\in \mathcal {A}}|\hat{\omega }_j-\omega _j|\le {cn^{-\kappa }}\}\subseteq \{\mathcal {A}\subseteq \mathcal {\hat{A}}\}$. Consequently, for $0\le \kappa <{\frac{1}{2}-\alpha }$ and $0<\alpha <\frac{1}{2}$,

$$\begin{aligned} P\left( \mathcal {A}\subseteq \mathcal {\hat{A}}\right)\ge & {} P\left( \max \limits _{j\in \mathcal {A}}|\hat{\omega }_j-\omega _j|\le {cn^{-\kappa }}\right) =1-P\left( \max \limits _{j\in \mathcal {A}}|\hat{\omega }_j-\omega _j|>cn^{-\kappa }\right) \\\ge & {} 1-s_nP\left( |\hat{\omega }_j-\omega _j|>cn^{-\kappa }\right) \\\ge & {} 1-O\left( s_{n}(n^{\beta +\alpha })\exp \left\{ -cn^{1-2\alpha -2\kappa }\right\} \right) , \end{aligned}$$

where $s_n$ is the cardinality of $\mathcal {A}$. This completes the proof of Theorem 1. $\square $
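To make the quantity analysed in the proof concrete, the following is a minimal sketch of how $\hat{\omega }_{j}$, as written in the first display of the proof, might be computed from data. Several choices are assumptions rather than the authors' specification: the function names `cwq_statistic` and `screen_top_d` are hypothetical, the quantile grid is taken as $\tau _k=k/(M+1)$, sample quantiles use NumPy's default interpolation, and features are selected by keeping the `d` largest statistics instead of thresholding at $cn^{-\kappa }$ (the constant $c$ is not available in practice). The toy data are invented for illustration.

```python
import numpy as np


def cwq_statistic(x, y, M=9):
    """Sample analogue of omega_j for one feature x and class labels y.

    Computes (1/M) * sum_k sum_r p_hat_r * (Q_hat_{tau_k}(x | y = y_r) - Q_hat_{tau_k}(x))^2.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    taus = np.arange(1, M + 1) / (M + 1)        # assumed grid: tau_k = k / (M + 1)
    q_marginal = np.quantile(x, taus)           # marginal sample quantiles
    labels, counts = np.unique(y, return_counts=True)
    p_hat = counts / y.size                     # class proportions p_hat_r
    omega = 0.0
    for label, p in zip(labels, p_hat):
        q_conditional = np.quantile(x[y == label], taus)
        omega += p * np.mean((q_conditional - q_marginal) ** 2)  # mean over k is 1/M * sum_k
    return float(omega)


def screen_top_d(X, y, d, M=9):
    """Rank features by the statistic and keep the d largest (an assumed selection rule)."""
    omegas = np.array([cwq_statistic(X[:, j], y, M) for j in range(X.shape[1])])
    keep = np.argsort(-omegas)[:d]
    return omegas, keep


# Toy data: three classes, heavy-tailed t(2) noise, only feature 0 depends on the class.
rng = np.random.default_rng(1)
n, p = 300, 50
y = rng.integers(0, 3, size=n)
X = rng.standard_t(df=2, size=(n, p))
X[:, 0] += 3.0 * y

omegas, keep = screen_top_d(X, y, d=5)
print("selected features:", keep)
```

Because the statistic depends on the data only through sample quantiles and class proportions, a few extreme observations in the toy data have limited influence on the ranking, which is the robustness property the abstract refers to.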

Proof of Theorem 2

If $\delta =\min \limits _{j\in \mathcal {A}}\omega _j-\max \limits _{j\in \mathcal {I}}\omega _{j}>0$, we can get

$$\begin{aligned} P\left( \min \limits _{j\in \mathcal {A}}\hat{\omega }_j>\max \limits _{j\in \mathcal {I}}\hat{\omega }_{j}\right)= & {} 1-P\left( \min \limits _{j\in \mathcal {A}}\hat{\omega }_j\le \max \limits _{j\in \mathcal {I}}\hat{\omega }_{j}\right) \\= & {} 1-P\left( \min \limits _{j\in \mathcal {A}}\hat{\omega }_j-\min \limits _{j\in \mathcal {A}}\omega _j+\delta \le \max \limits _{j\in \mathcal {I}}\hat{\omega }_{j}-\max \limits _{j\in \mathcal {I}}\omega _{j}\right) \\\ge & {} 1-P\left( \max \limits _{j\in \mathcal {I}}|\hat{\omega }_{j}-\omega _{j}|\ge \frac{\delta }{2}\right) -P\left( \max \limits _{j\in \mathcal {A}}|\hat{\omega }_{j}-\omega _{j}|\ge \frac{\delta }{2}\right) \\\ge & {} 1-O\left( p(n^{\beta +\alpha })\exp \left\{ -c\delta ^{2}n^{1-2\alpha }\right\} \right) . \end{aligned}$$

$\square $
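The third line of the display in the proof of Theorem 2 can be unpacked as follows; this is a sketch of the omitted step and uses only $\delta $ as defined there. Suppose that both $\max \limits _{j\in \mathcal {I}}|\hat{\omega }_{j}-\omega _{j}|<\frac{\delta }{2}$ and $\max \limits _{j\in \mathcal {A}}|\hat{\omega }_{j}-\omega _{j}|<\frac{\delta }{2}$. Then

$$\begin{aligned} \min \limits _{j\in \mathcal {A}}\hat{\omega }_j-\max \limits _{j\in \mathcal {I}}\hat{\omega }_{j}>\left( \min \limits _{j\in \mathcal {A}}\omega _j-\frac{\delta }{2}\right) -\left( \max \limits _{j\in \mathcal {I}}\omega _{j}+\frac{\delta }{2}\right) =\delta -\delta =0. \end{aligned}$$

Hence the event $\{\min \limits _{j\in \mathcal {A}}\hat{\omega }_j\le \max \limits _{j\in \mathcal {I}}\hat{\omega }_{j}\}$ forces at least one of the two maximal deviations to be at least $\frac{\delta }{2}$, and the union bound gives the third line. The final line then appears to follow from the exponential bound obtained in the proof of Theorem 1, applied with $\varepsilon $ proportional to $\delta $.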

About this article

Cite this article

Song, F., Lai, P. & Shen, B. Robust composite weighted quantile screening for ultrahigh dimensional discriminant analysis. Metrika 83, 799–820 (2020). https://doi.org/10.1007/s00184-019-00758-x

Received: 03 September 2018
Published: 06 January 2020
Issue Date: October 2020
DOI: https://doi.org/10.1007/s00184-019-00758-x
