Bayesian Variable Selection in a Million Dimensions, arXiv - STAT - Methodology


Bayesian Variable Selection in a Million Dimensions
arXiv - STAT - Methodology, Pub Date: 2022-08-02, DOI: arxiv-2208.01180
Martin Jankowiak

Bayesian variable selection is a powerful tool for data analysis, as it offers a principled method for variable selection that accounts for prior information and uncertainty. However, wider adoption of Bayesian variable selection has been hampered by computational challenges, especially in difficult regimes with a large number of covariates P or non-conjugate likelihoods. To scale to the large-P regime, we introduce an efficient MCMC scheme whose cost per iteration is sublinear in P. In addition, we show how this scheme can be extended to generalized linear models for count data, which are prevalent in biology, ecology, economics, and beyond. In particular, we design efficient algorithms for variable selection in binomial and negative binomial regression, the former of which includes logistic regression as a special case. In experiments, we demonstrate the effectiveness of our methods, including on cancer and maize genomic data.
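For readers unfamiliar with the setup, the following is a minimal sketch of MCMC-based Bayesian variable selection in linear regression. It is not the paper's algorithm: it is a naive single-flip Metropolis-Hastings sampler over inclusion indicators that proposes a coordinate uniformly from all P covariates and recomputes the marginal likelihood from scratch, so it is only practical for small P. The model choices (noise variance fixed to 1, coefficient prior N(0, I / tau), prior inclusion probability h) and all function names are assumptions made for illustration.

```python
import numpy as np


def log_posterior(X, y, gamma, tau=1.0, h=0.1):
    """Unnormalized log p(gamma | y), assuming sigma^2 = 1 and beta ~ N(0, I / tau).

    The coefficients are integrated out analytically, so the result depends
    only on which covariates are included (the boolean vector gamma).
    """
    idx = np.flatnonzero(gamma)
    k, P = idx.size, gamma.size
    # Independent Bernoulli(h) prior on each inclusion indicator.
    log_prior = k * np.log(h) + (P - k) * np.log1p(-h)
    if k == 0:
        return log_prior - 0.5 * (y @ y)
    Xg = X[:, idx]
    A = Xg.T @ Xg + tau * np.eye(k)
    L = np.linalg.cholesky(A)              # A = L L^T
    z = np.linalg.solve(L, Xg.T @ y)       # z @ z = y^T Xg A^{-1} Xg^T y
    # log det(I + Xg Xg^T / tau) = log det(A) - k log(tau)
    log_det = 2.0 * np.sum(np.log(np.diag(L))) - k * np.log(tau)
    quad = y @ y - z @ z
    return log_prior - 0.5 * log_det - 0.5 * quad


def sample_pips(X, y, num_steps=2000, burn_in=500, seed=0, tau=1.0, h=0.1):
    """Estimate posterior inclusion probabilities with single-flip Metropolis-Hastings."""
    rng = np.random.default_rng(seed)
    P = X.shape[1]
    gamma = np.zeros(P, dtype=bool)        # start from the empty model
    lp = log_posterior(X, y, gamma, tau, h)
    counts = np.zeros(P)
    for t in range(num_steps):
        j = rng.integers(P)                # symmetric proposal: flip one indicator
        proposal = gamma.copy()
        proposal[j] = not proposal[j]
        lp_prop = log_posterior(X, y, proposal, tau, h)
        if np.log(rng.uniform()) < lp_prop - lp:
            gamma, lp = proposal, lp_prop
        if t >= burn_in:
            counts += gamma
    return counts / (num_steps - burn_in)


if __name__ == "__main__":
    # Synthetic data: only the first two of 20 covariates affect the response.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 20))
    y = 2.0 * X[:, 0] + 2.0 * X[:, 1] + rng.standard_normal(200)
    print(np.round(sample_pips(X, y), 2))
```

Each step of this sketch picks one of the P indicators uniformly at random, which mixes poorly when few covariates are relevant; the abstract's claim of per-iteration cost sublinear in P refers to the paper's own scheme, which this example does not attempt to reproduce.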


Updated: 2022-08-03
