Hybrid safe-strong rules for efficient optimization in lasso-type problems
Computational Statistics & Data Analysis (IF 1.5), Pub Date: 2021-01-01, DOI: 10.1016/j.csda.2020.107063
Yaohui Zeng, Tianbao Yang, Patrick Breheny

The lasso model has been widely used for model selection in data mining, machine learning, and high-dimensional statistical analysis. However, with the ultrahigh-dimensional, large-scale data sets now collected in many real-world applications, it is important to develop algorithms to solve the lasso that efficiently scale up to problems of this size. Discarding features from certain steps of the algorithm is a powerful technique for increasing efficiency and addressing the Big Data challenge. In this paper, we propose a family of hybrid safe-strong rules (HSSR) which incorporate safe screening rules into the sequential strong rule (SSR) to remove unnecessary computational burden. In particular, we present two instances of HSSR, namely SSR-Dome and SSR-BEDPP, for the standard lasso problem. We further extend SSR-BEDPP to the elastic net and group lasso problems to demonstrate the generalizability of the hybrid screening idea. Extensive numerical experiments with synthetic and real data sets are conducted for both the standard lasso and the group lasso problems. Results show that our proposed hybrid rules can substantially outperform existing state-of-the-art rules.
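
To make the screening idea concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes the lasso objective is scaled as (1/(2n))‖y − Xβ‖² + λ‖β‖₁ with standardized columns. Under that scaling, the sequential strong rule discards feature j at λ_next when |x_jᵀr|/n < 2λ_next − λ_prev, where r is the residual at the previous solution. The function name `strong_rule_keep` and the `safe_keep` argument are hypothetical. `safe_keep` stands in for the output of any safe rule (such as Dome or BEDPP, whose formulas are not reproduced here), so that correlations are computed only for features the safe rule could not eliminate.

```python
import numpy as np


def strong_rule_keep(X, residual, lam_prev, lam_next, safe_keep=None):
    """Boolean mask of features that survive screening at lam_next.

    safe_keep is a hypothetical boolean mask produced by some safe rule;
    features marked False there are never touched again.
    """
    n, p = X.shape
    if safe_keep is None:
        safe_keep = np.ones(p, dtype=bool)
    keep = np.zeros(p, dtype=bool)
    # Correlations with the previous residual, restricted to the safe set.
    corr = np.abs(X[:, safe_keep].T @ residual) / n
    # Sequential strong rule: keep a feature unless its correlation
    # falls below the threshold 2 * lam_next - lam_prev.
    keep[safe_keep] = corr >= 2.0 * lam_next - lam_prev
    return keep


# Small synthetic example (illustrative data only).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = 2.0 * X[:, 0] + rng.standard_normal(50)
y = y - y.mean()

# At lam_max the solution is all zeros, so the residual equals y.
lam_max = np.max(np.abs(X.T @ y)) / X.shape[0]
keep = strong_rule_keep(X, y, lam_prev=lam_max, lam_next=0.9 * lam_max)

# Pretend a safe rule has already eliminated feature 3.
safe = np.ones(8, dtype=bool)
safe[3] = False
keep_hybrid = strong_rule_keep(X, y, lam_max, 0.9 * lam_max, safe_keep=safe)
```

Because the strong rule is a heuristic, a solver built on it still has to check the optimality (KKT) conditions afterwards for the features it discarded. In this sketch the safe mask shrinks the set of features for which correlations are computed, which is one way to read the "unnecessary computational burden" that the abstract refers to.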

Updated: 2021-01-01