A scalable surrogate L0 sparse regression method for generalized linear models with applications to large scale data, Journal of Statistical Planning and Inference


Journal of Statistical Planning and Inference (IF 0.8) Pub Date: 2021-07-01, DOI: 10.1016/j.jspi.2020.12.001
Ning Li, Xiaoling Peng, Eric Kawaguchi, Marc A. Suchard, Gang Li

Abstract: This paper rigorously studies the large-sample properties of a surrogate L0 penalization method that iteratively performs reweighted L2-penalized regressions for generalized linear models, and develops a scalable implementation of the method for sparse high-dimensional massive sample size (sHDMSS) data. We show that, for generalized linear models, the limit of the algorithm, referred to as the broken adaptive ridge (BAR) estimator, is consistent for variable selection, enjoys an oracle property for parameter estimation, and possesses a grouping property for highly correlated covariates. We further demonstrate that, by taking advantage of an existing efficient implementation of massive L2-penalized generalized linear models, the proposed BAR method can be conveniently implemented for sHDMSS data. An illustration is given using a large sHDMSS dataset from the Truven MarketScan Medicare (MDCR) database to investigate the safety of dabigatran versus warfarin for the treatment of nonvalvular atrial fibrillation in elderly patients.
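The iteration described in the abstract can be illustrated with a small sketch for logistic regression. This is not the authors' implementation and makes no attempt at the scalability the paper targets. It assumes that each step minimizes -2 × log-likelihood plus a ridge penalty with weights lam / beta_j² taken from the previous iterate, that the starting value is an ordinary ridge fit with penalty xi, and that coefficients falling below a small threshold eps are set to zero for numerical stability. The function names, the parameters lam, xi, and eps, and the dense Newton solver are all hypothetical choices made for illustration.

```python
import numpy as np


def _weighted_ridge_logistic(X, y, penalty, beta0, n_newton=50, tol=1e-8):
    """Minimize -2*loglik(beta) + sum_j penalty[j] * beta[j]**2 by Newton's method.

    Illustrative dense solver only; `penalty` holds one ridge weight per column.
    """
    beta = beta0.astype(float).copy()
    for _ in range(n_newton):
        eta = np.clip(X @ beta, -30.0, 30.0)   # clip to avoid overflow in exp
        mu = 1.0 / (1.0 + np.exp(-eta))        # fitted probabilities
        w = mu * (1.0 - mu)                    # logistic variance weights
        grad = -2.0 * (X.T @ (y - mu)) + 2.0 * penalty * beta
        hess = 2.0 * ((X.T * w) @ X) + 2.0 * np.diag(penalty)
        step = np.linalg.solve(hess, grad)
        beta = beta - step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def bar_logistic(X, y, lam=1.0, xi=1.0, n_iter=100, eps=1e-6, tol=1e-8):
    """Sketch of an iteratively reweighted ridge (BAR-style) logistic fit.

    lam: penalty level of the reweighted steps (assumed parameterization).
    xi:  ridge penalty of the initial fit (assumed).
    eps: threshold below which a coefficient is treated as exactly zero (assumed).
    """
    p = X.shape[1]
    # Initial estimate: plain ridge regression with a common penalty xi.
    beta = _weighted_ridge_logistic(X, y, np.full(p, xi), np.zeros(p))
    for _ in range(n_iter):
        active = np.abs(beta) > eps            # coefficients still in the model
        new_beta = np.zeros(p)
        if active.any():
            # Reweighted L2 penalty: weight lam / beta_j**2 from the last iterate.
            penalty = lam / beta[active] ** 2
            new_beta[active] = _weighted_ridge_logistic(
                X[:, active], y, penalty, beta[active]
            )
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    beta[np.abs(beta) <= eps] = 0.0
    return beta
```

Under these assumptions, small coefficients receive ever larger ridge weights and shrink toward zero across iterations, while large coefficients are penalized only lightly, which is how repeated L2 fits can mimic an L0-type penalty. A call such as `bar_logistic(X, y, lam=np.log(len(y)))` would return a coefficient vector with exact zeros for the dropped covariates; the choice of `lam` here is only an example.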


Updated: 2021-07-01
