Which bridge estimator is optimal for variable selection?

Wang, Shuaiwen; Weng, Haolei; Maleki, Arian

arXiv:1705.08617 (math)

[Submitted on 24 May 2017 (v1), last revised 25 Mar 2020 (this version, v4)]

Abstract: We study the problem of variable selection for linear models under the high-dimensional asymptotic setting, where the number of observations $n$ grows at the same rate as the number of predictors $p$. We consider two-stage variable selection techniques (TVS) in which the first stage uses bridge estimators to obtain an estimate of the regression coefficients, and the second stage simply thresholds this estimate to select the "important" predictors. The asymptotic false discovery proportion (AFDP) and true positive proportion (ATPP) of these TVS are evaluated. We prove that for a fixed ATPP, in order to obtain a smaller AFDP, one should pick a bridge estimator with smaller asymptotic mean square error in the first stage of TVS. Based on such principled discovery, we present a sharp comparison of different TVS, via an in-depth investigation of the estimation properties of bridge estimators. Rather than "order-wise" error bounds with loose constants, our analysis focuses on precise error characterization. Various interesting signal-to-noise ratio and sparsity settings are studied. Our results offer new and thorough insights into high-dimensional variable selection. For instance, we prove that a TVS with Ridge in its first stage outperforms TVS with other bridge estimators in large noise settings; two-stage LASSO becomes inferior when the signal is rare and weak. As a by-product, we show that two-stage methods outperform some standard variable selection techniques, such as LASSO and Sure Independence Screening, under certain conditions.
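
The two-stage procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes a Ridge first stage (the bridge estimator with $q = 2$), a penalty normalization chosen here for convenience, a Gaussian design with variance $1/n$ entries, and arbitrary illustrative values for `n`, `p`, `k`, `lam`, and `s`. The function names are hypothetical. It computes the finite-sample false discovery and true positive proportions, whose limits the paper calls AFDP and ATPP.

```python
import numpy as np

def ridge_estimate(X, y, lam):
    """First stage (assumed Ridge): argmin 0.5*||y - Xb||^2 + lam*||b||^2."""
    p = X.shape[1]
    # Closed-form minimizer of the objective above.
    return np.linalg.solve(X.T @ X + 2.0 * lam * np.eye(p), X.T @ y)

def two_stage_select(beta_hat, s):
    """Second stage: keep predictors whose estimate exceeds s in magnitude."""
    return np.abs(beta_hat) > s

def fdp_tpp(selected, beta_true):
    """Finite-sample false discovery and true positive proportions."""
    signal = beta_true != 0
    n_selected = max(int(selected.sum()), 1)  # avoid division by zero
    n_signal = max(int(signal.sum()), 1)
    fdp = float((selected & ~signal).sum()) / n_selected
    tpp = float((selected & signal).sum()) / n_signal
    return fdp, tpp

# Synthetic data; every constant below is an illustrative assumption.
rng = np.random.default_rng(0)
n, p, k = 300, 200, 20
X = rng.normal(size=(n, p)) / np.sqrt(n)
beta_true = np.zeros(p)
beta_true[:k] = 4.0
y = X @ beta_true + 0.5 * rng.normal(size=n)

beta_hat = ridge_estimate(X, y, lam=0.1)
selected = two_stage_select(beta_hat, s=2.0)
fdp, tpp = fdp_tpp(selected, beta_true)
print(f"FDP = {fdp:.3f}, TPP = {tpp:.3f}")
```

Sweeping the threshold `s` traces out a curve of FDP against TPP for a given first-stage estimator, which is the kind of comparison the abstract refers to when it fixes ATPP and compares AFDP.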

Comments:	84 pages, 11 figures
Subjects:	Statistics Theory (math.ST); Information Theory (cs.IT)
Cite as:	arXiv:1705.08617 [math.ST]
	(or arXiv:1705.08617v4 [math.ST] for this version)
	https://doi.org/10.48550/arXiv.1705.08617

Submission history

From: Shuaiwen Wang
[v1] Wed, 24 May 2017 05:47:46 UTC (402 KB)
[v2] Mon, 30 Jul 2018 03:52:13 UTC (428 KB)
[v3] Tue, 12 Mar 2019 21:17:56 UTC (605 KB)
[v4] Wed, 25 Mar 2020 20:57:14 UTC (825 KB)
