On constrained and regularized high-dimensional regression
Annals of the Institute of Statistical Mathematics (IF 1) | Pub Date: 2013-01-12 | DOI: 10.1007/s10463-012-0396-3 | Authors: Xiaotong Shen, Wei Pan, Yunzhang Zhu, Hui Zhou
High-dimensional feature selection has become increasingly crucial for seeking parsimonious models in estimation. For selection consistency, we derive a necessary and sufficient condition formulated in terms of the degree of separation. The minimal degree of separation is necessary for any method to be selection consistent. At a level slightly higher than the minimal degree of separation, selection consistency is achieved by a constrained $L_0$-method and by its computational surrogate, the constrained truncated $L_1$-method. This permits the number of features to grow up to exponentially in the sample size; in this sense, these methods are optimal for feature selection against any competing selection method. In contrast, their regularization counterparts, $L_0$-regularization and truncated $L_1$-regularization, achieve selection consistency only under slightly stronger assumptions. More importantly, such selection yields sharper parameter estimation and prediction, leading to minimax-optimal parameter estimation, which is otherwise impossible in high-dimensional analysis without a good selection method.
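To make the truncated $L_1$ penalty concrete, below is a minimal numpy sketch. The penalty on a coefficient vector is $J_\tau(\beta)=\sum_j \min(|\beta_j|,\tau)/\tau$, which behaves like the lasso for small coefficients but stops penalizing once a coefficient exceeds the threshold $\tau$, mimicking $L_0$. The solver shown (iteratively reweighted proximal gradient, where coordinates with $|\beta_j|\ge\tau$ get zero penalty weight) is one illustrative way to handle the nonconvexity, not the authors' algorithm; the names `tlp_regression`, `lam`, and `tau` and their values are illustrative choices, not from the paper.

```python
import numpy as np

def tlp(beta, tau):
    # Truncated L1 penalty: sum_j min(|beta_j|, tau) / tau
    return np.minimum(np.abs(beta), tau).sum() / tau

def tlp_regression(X, y, lam=0.1, tau=0.5, n_outer=200, n_inner=5):
    """Illustrative solver for (1/2n)||y - X beta||^2 + lam * tlp(beta, tau).

    Outer loop: recompute penalty weights (0 for coordinates already
    above tau, lam/tau otherwise).  Inner loop: proximal-gradient
    (ISTA-style) steps for the resulting weighted-lasso subproblem.
    """
    n, p = X.shape
    # Step size 1/L, where L = ||X||_2^2 / n is the gradient's
    # Lipschitz constant for the least-squares loss.
    step = n / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(p)
    for _ in range(n_outer):
        w = (np.abs(beta) < tau).astype(float)  # active penalty weights
        for _ in range(n_inner):
            grad = X.T @ (X @ beta - y) / n
            z = beta - step * grad
            thr = step * lam * w / tau
            beta = np.sign(z) * np.maximum(np.abs(z) - thr, 0.0)
    return beta
```

On a noiseless toy problem with one strong true feature, the estimate concentrates on that feature and, because the penalty vanishes for coefficients above `tau`, the surviving coefficient is not shrunk toward zero the way a plain lasso estimate would be.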
Updated: 2013-01-12