A fast and efficient smoothing approach to Lasso regression and an application in statistical genetics: polygenic risk scores for chronic obstructive pulmonary disease (COPD)
Statistics and Computing (IF 1.6), Pub Date: 2021-04-17, DOI: 10.1007/s11222-021-10010-0
Georg Hahn, Sharon M. Lutz, Nilanjana Laha, Michael H. Cho, Edwin K. Silverman, Christoph Lange

High-dimensional linear regression problems are often fitted using Lasso approaches. Although the Lasso objective function is convex, it is not differentiable everywhere, which makes minimization by gradient descent not straightforward. To avoid this technical issue, we apply Nesterov smoothing to the original (unsmoothed) Lasso objective function. We introduce a closed-form smoothed Lasso which preserves the convexity of the Lasso function, is uniformly close to the unsmoothed Lasso, and has closed-form derivatives everywhere, allowing fast and efficient minimization via gradient descent. Our simulation studies focus on polygenic risk scores using genetic data from a genome-wide association study (GWAS) for chronic obstructive pulmonary disease (COPD). We compare the accuracy and runtime of our approach to those of the current gold standard in the literature, the FISTA algorithm. Our results suggest that the proposed methodology provides estimates with equal or higher accuracy than the FISTA algorithm while having the same asymptotic runtime scaling. The proposed methodology is implemented in the R package smoothedLasso, available on the Comprehensive R Archive Network (CRAN).
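
To make the idea concrete, here is a minimal Python sketch of a smoothed Lasso minimized by plain gradient descent. It is an illustration under stated assumptions, not the paper's exact formulation and not the interface of the smoothedLasso R package. It assumes the objective is scaled as (1/(2n))·||y − Xβ||² + λ·Σ|β_j|, and it replaces each |β_j| with the Huber function, which is the well-known result of Nesterov smoothing of the absolute value with a quadratic prox function. The function names, the parameter `mu`, and the fixed step size are all hypothetical choices made for this example.

```python
import numpy as np


def smooth_abs(x, mu):
    """Huber-type smooth surrogate for |x| (assumed smoothing, parameter mu > 0).

    Quadratic on [-mu, mu], linear outside; convex and differentiable everywhere.
    """
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= mu, x**2 / (2.0 * mu), np.abs(x) - mu / 2.0)


def smooth_abs_grad(x, mu):
    """Closed-form derivative of smooth_abs: x / mu clipped to [-1, 1]."""
    return np.clip(np.asarray(x, dtype=float) / mu, -1.0, 1.0)


def smoothed_lasso_objective(beta, X, y, lam, mu):
    """Least-squares loss (assumed 1/(2n) scaling) plus the smoothed L1 penalty."""
    n = X.shape[0]
    residual = y - X @ beta
    return residual @ residual / (2.0 * n) + lam * smooth_abs(beta, mu).sum()


def fit_smoothed_lasso(X, y, lam, mu=0.01, n_iter=2000):
    """Gradient descent with the fixed step 1/L, starting from beta = 0.

    L bounds the Lipschitz constant of the gradient: the spectral norm of X
    squared over n for the loss, plus lam / mu for the smoothed penalty.
    """
    n, p = X.shape
    lipschitz = np.linalg.norm(X, 2) ** 2 / n + lam / mu
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n + lam * smooth_abs_grad(beta, mu)
        beta -= grad / lipschitz
    return beta


if __name__ == "__main__":
    # Synthetic data for illustration only (not the COPD GWAS data).
    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 10))
    beta_true = np.zeros(10)
    beta_true[:2] = [2.0, -1.0]
    y = X @ beta_true + 0.01 * rng.standard_normal(50)
    print(np.round(fit_smoothed_lasso(X, y, lam=0.1), 3))
```

With this surrogate, the smoothed penalty differs from |x| by at most `mu / 2` at every point, which is one way to read "uniformly close": a smaller `mu` tracks the Lasso more tightly but raises the Lipschitz bound `lam / mu` and therefore shrinks the step size.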




Updated: 2021-04-18