A modified least angle regression algorithm for interaction selection with heredity
Statistical Analysis and Data Mining (IF 2.1). Pub Date: 2022-02-28. DOI: 10.1002/sam.11577
Woosung Kim, Seonghyeon Kim, Myung Hwan Na, Yongdai Kim

In many practical problems, the main effects alone may not be enough to capture the relationship between the response and the predictors, and interaction effects are often of interest to scientific researchers. When considering a regression model with main effects and all possible two-way interaction effects, which we call the two-way interaction model, there is an important challenge: the computational burden. One way to reduce this burden is to impose the heredity constraint between the main and interaction effects. The heredity constraint assumes that an interaction effect is significant only when its corresponding main effects are significant. Various sparse penalized methods that reflect the heredity constraint have been proposed, but these algorithms remain computationally demanding and can only be applied to data where the dimension of the main effects is at most a few hundred. In this paper, we propose a modification of the LARS algorithm for selecting interaction effects under the heredity constraint that can be applied to high-dimensional data. Our numerical studies confirm that the proposed modified LARS algorithm is much faster and uses less memory than its competitors, while achieving comparable prediction accuracy when the dimension of the covariates is large.
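To make the setting concrete, the sketch below illustrates in Python how a strong heredity constraint restricts the candidate set during a simple forward-selection pass over main effects and two-way interactions: an interaction x_j * x_k becomes eligible only once both parent main effects have entered the model. This is a minimal illustration of the constraint described in the abstract, not the paper's modified LARS algorithm; the function name, correlation-based selection rule, and stopping rule are assumptions made for the example.

import numpy as np

def heredity_forward_selection(X, y, max_terms=10):
    """Greedy forward selection over main effects and two-way interactions,
    enforcing strong heredity: an interaction (j, k) becomes a candidate only
    after both parent main effects j and k are already selected."""
    n, p = X.shape
    mains, inters = [], []
    residual = y - y.mean()

    def column(term):
        # A term is an int (main effect) or a (j, k) tuple (interaction).
        return X[:, term] if isinstance(term, int) else X[:, term[0]] * X[:, term[1]]

    for _ in range(max_terms):
        # Candidates: unselected main effects, plus interactions whose parent
        # main effects are both already in the model (strong heredity).
        candidates = [j for j in range(p) if j not in mains]
        candidates += [(j, k) for i, j in enumerate(mains) for k in mains[i + 1:]
                       if (j, k) not in inters and (k, j) not in inters]
        if not candidates:
            break
        # Enter the candidate most correlated with the current residual.
        best = max(candidates,
                   key=lambda t: abs(np.corrcoef(column(t), residual)[0, 1]))
        (mains if isinstance(best, int) else inters).append(best)
        # Refit least squares on all selected terms and update the residual.
        Z = np.column_stack([np.ones(n)] + [column(t) for t in mains + inters])
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        residual = y - Z @ coef

    return mains, inters

For instance, if only x_0, x_1, and their product drive the response, such a loop would typically admit the main effects 0 and 1 first, and only then would the interaction (0, 1) appear in the candidate set; interactions between unselected variables are never scored, which is what keeps the search over all possible two-way interactions computationally manageable.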

Updated: 2022-02-28