Scalable Bayesian regression in high dimensions with multiple data sources
Journal of Computational and Graphical Statistics (IF 1.4), Pub Date: 2019-07-15, DOI: 10.1080/10618600.2019.1624294
Konstantinos Perrakis 1 , Sach Mukherjee 1 , The Alzheimer’s Disease Neuroimaging Initiative

Abstract Applications of high-dimensional regression often involve multiple sources or types of covariates. We propose methodology for this setting, emphasizing the “wide data” regime with large total dimensionality p and sample size n ≪ p. We focus on a flexible ridge-type prior with shrinkage levels that are specific to each data type or source and that are set automatically by empirical Bayes. All estimation, including setting of shrinkage levels, is formulated mainly in terms of inner product matrices of size n × n. This renders computation efficient in the wide data regime and allows scaling to problems with millions of features. Furthermore, the proposed procedures are free of user-set tuning parameters. We show how sparsity can be achieved by post-processing of the Bayesian output via constrained minimization of a certain Kullback–Leibler divergence. This yields sparse solutions with adaptive, source-specific shrinkage, including a closed-form variant that scales to very large p. We present empirical results from a simulation study based on real data and a case study in Alzheimer’s disease involving millions of features and multiple data sources. Supplementary materials for this article are available online.
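
To illustrate why working with inner product matrices helps in the wide data regime, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian ridge-type prior with one fixed shrinkage level per source and uses the standard identity (X'X + Λ)^{-1} X'y = Λ^{-1} X' (I + X Λ^{-1} X')^{-1} y, so the only linear system solved is n x n, with n the sample size. The function names, the shrinkage values, and the toy data are hypothetical; in particular, the shrinkage levels are set by hand here, whereas the abstract describes setting them by empirical Bayes.

```python
import numpy as np


def ridge_posterior_mean_dual(X_sources, y, lambdas):
    """Ridge posterior mean per source, using only an n x n linear system.

    X_sources: list of (n, p_k) arrays, one per data source (illustrative).
    lambdas:   one positive shrinkage level per source (fixed by hand here).
    """
    n = y.shape[0]
    # I + sum_k (X_k X_k^T) / lambda_k, built from per-source Gram matrices.
    K = np.eye(n)
    for X_k, lam_k in zip(X_sources, lambdas):
        K += (X_k @ X_k.T) / lam_k
    alpha = np.linalg.solve(K, y)  # n x n solve, independent of p
    # Map back to coefficient space, one block per source.
    return [X_k.T @ alpha / lam_k for X_k, lam_k in zip(X_sources, lambdas)]


def ridge_posterior_mean_primal(X_sources, y, lambdas):
    """Reference p x p computation, feasible only for small p."""
    X = np.hstack(X_sources)
    penalty = np.concatenate(
        [np.full(X_k.shape[1], lam_k) for X_k, lam_k in zip(X_sources, lambdas)]
    )
    return np.linalg.solve(X.T @ X + np.diag(penalty), X.T @ y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 20
    # Two toy sources with different dimensionality (n much smaller than p).
    X_sources = [rng.standard_normal((n, 300)), rng.standard_normal((n, 500))]
    y = rng.standard_normal(n)
    lambdas = [5.0, 50.0]

    beta_dual = np.concatenate(ridge_posterior_mean_dual(X_sources, y, lambdas))
    beta_primal = ridge_posterior_mean_primal(X_sources, y, lambdas)
    print("dual and primal agree:", np.allclose(beta_dual, beta_primal))
```

The dual form costs roughly O(n^2 p) to build the Gram matrices plus O(n^3) for the solve, whereas the primal form needs a p x p solve, which is why the former remains practical when p reaches the millions.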

Updated: 2019-07-15