The joint lasso: high-dimensional regression for group structured data.
Biostatistics (IF 1.8), Pub Date: 2018-09-05, DOI: 10.1093/biostatistics/kxy035
Frank Dondelinger, Sach Mukherjee

We consider high-dimensional regression over subgroups of observations. Our work is motivated by biomedical problems, where subsets of samples, representing for example disease subtypes, may differ with respect to underlying regression models. In the high-dimensional setting, estimating a different model for each subgroup is challenging due to limited sample sizes. Focusing on the case in which subgroup-specific models may be expected to be similar but not necessarily identical, we treat subgroups as related problem instances and jointly estimate subgroup-specific regression coefficients. This is done in a penalized framework, combining an $\ell_1$ term with an additional term that penalizes differences between subgroup-specific coefficients. This gives solutions that are globally sparse but that allow information-sharing between the subgroups. We present algorithms for estimation and empirical results on simulated data and using Alzheimer's disease, amyotrophic lateral sclerosis, and cancer datasets. These examples demonstrate the gains joint estimation can offer in prediction as well as in providing subgroup-specific sparsity patterns.
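The penalized framework described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the difference penalty or the estimation algorithm, so the code below makes several assumptions: the difference term is taken to be an unweighted squared $\ell_2$ penalty over all pairs of subgroups, and the solver is a generic proximal-gradient (ISTA) loop swapped in for illustration, not the authors' algorithm. The names `joint_objective`, `fit_joint`, `lam`, and `gamma` are hypothetical.

```python
import numpy as np

def joint_objective(Xs, ys, betas, lam, gamma):
    """Loss + l1 penalty + pairwise difference penalty (assumed squared-l2 form)."""
    K = len(Xs)
    # Per-subgroup squared-error loss, scaled by subgroup sample size.
    loss = sum(np.sum((ys[k] - Xs[k] @ betas[k]) ** 2) / (2 * len(ys[k]))
               for k in range(K))
    # Global sparsity: l1 norm of every subgroup's coefficient vector.
    l1 = lam * sum(np.abs(b).sum() for b in betas)
    # Information sharing: penalize differences between subgroup coefficients.
    fusion = gamma * sum(np.sum((betas[k] - betas[j]) ** 2)
                         for k in range(K) for j in range(k + 1, K))
    return loss + l1 + fusion

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def fit_joint(Xs, ys, lam, gamma, n_iter=500):
    """Proximal-gradient sketch (an assumption, not the paper's algorithm)."""
    K, p = len(Xs), Xs[0].shape[1]
    B = np.zeros((K, p))  # row k holds the coefficients of subgroup k
    # Upper bound on the Lipschitz constant of the smooth part's gradient.
    L = max(np.linalg.norm(X, 2) ** 2 / len(y) for X, y in zip(Xs, ys))
    L += 2 * gamma * K
    step = 1.0 / L
    for _ in range(n_iter):
        grad = np.zeros_like(B)
        total = B.sum(axis=0)
        for k in range(K):
            # Gradient of the squared-error loss for subgroup k.
            grad[k] = Xs[k].T @ (Xs[k] @ B[k] - ys[k]) / len(ys[k])
            # Gradient of the pairwise squared-difference penalty.
            grad[k] += 2 * gamma * (K * B[k] - total)
        # Gradient step on the smooth part, then soft-threshold for the l1 term.
        B = soft_threshold(B - step * grad, step * lam)
    return B
```

Under these assumptions, `gamma = 0` reduces to fitting a separate lasso per subgroup, while larger `gamma` pulls the subgroup-specific coefficient vectors toward one another; `lam` controls overall sparsity.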

Updated: 2020-04-17