Scalable Bayesian Variable Selection Using Nonlocal Prior Densities in Ultrahigh-dimensional Settings
Statistica Sinica (IF 1.4), Pub Date: 2018-01-01, DOI: 10.5705/ss.202016.0167
Minsuk Shin, Anirban Bhattacharya, Valen E. Johnson

Bayesian model selection procedures based on nonlocal alternative prior densities are extended to ultrahigh-dimensional settings and compared with other variable selection procedures using precision-recall curves. The comparison includes methods based on g-priors, the reciprocal lasso, the adaptive lasso, SCAD, and the minimax concave penalty (MCP). The use of precision-recall curves eliminates the sensitivity of our conclusions to the choice of tuning parameters. We find that Bayesian variable selection procedures based on nonlocal priors are competitive with all other procedures in a range of simulation scenarios, and we subsequently explain this favorable performance through a theoretical examination of their consistency properties. Under certain regularity conditions, we demonstrate that the nonlocal procedures are consistent for linear models even when the number of covariates p grows sub-exponentially with the sample size n. A model selection procedure based on Zellner's g-prior is also found to be competitive with penalized likelihood methods in identifying the true model, but the posterior distribution over the model space induced by this method is much more dispersed than that induced by the nonlocal prior methods. We investigate the asymptotic form of the marginal likelihood based on the nonlocal priors and show that it contains a unique term that cannot be derived from the other Bayesian model selection procedures. We also propose a scalable and efficient algorithm, Simplified Shotgun Stochastic Search with Screening (S5), to explore the enormous model space, and we show that S5 dramatically reduces computing time without losing the capacity to search the interesting regions of the model space, at least in the simulation settings considered. The S5 algorithm is available in the R package BayesS5 on CRAN.
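The "screen, then search" idea behind S5, and the precision-recall evaluation used in the comparisons, can be illustrated with a simplified sketch. This is a hypothetical Python illustration, not the authors' algorithm or the BayesS5 package: it substitutes marginal-correlation screening for the paper's screening step, a deterministic greedy add/delete search for the shotgun stochastic search, and the extended BIC (EBIC) for the nonlocal-prior marginal likelihood.

```python
# Illustrative sketch (NOT the BayesS5 implementation): screen the covariates,
# greedily search models over the screened set, then score the selected model
# against the true support with precision and recall.
import numpy as np


def screen(X, y, keep):
    """Keep the `keep` covariates with the largest absolute marginal correlation with y."""
    score = np.abs(X.T @ (y - y.mean()))
    return np.argsort(score)[::-1][:keep]


def ebic(X, y, model, p):
    """Extended BIC of a least-squares fit on the covariate subset `model`."""
    n = len(y)
    if model:
        Xm = X[:, model]
        beta, *_ = np.linalg.lstsq(Xm, y, rcond=None)
        rss = np.sum((y - Xm @ beta) ** 2)
    else:
        rss = np.sum((y - y.mean()) ** 2)
    # EBIC penalty (gamma = 1) guards against overfitting when p >> n.
    return n * np.log(rss / n) + len(model) * (np.log(n) + 2 * np.log(p))


def greedy_search(X, y, candidates, p, max_size=10):
    """Greedy add/delete moves over screened candidates until no move improves EBIC
    (a deterministic stand-in for the stochastic search in S5)."""
    model, best = [], ebic(X, y, [], p)
    improved = True
    while improved:
        improved = False
        for j in candidates:                       # addition moves
            if j in model or len(model) >= max_size:
                continue
            s = ebic(X, y, model + [j], p)
            if s < best:
                best, model, improved = s, model + [j], True
        for j in list(model):                      # deletion moves
            reduced = [k for k in model if k != j]
            s = ebic(X, y, reduced, p)
            if s < best:
                best, model, improved = s, reduced, True
    return sorted(int(j) for j in model)


def precision_recall(selected, truth):
    tp = len(set(selected) & set(truth))
    return (tp / len(selected) if selected else 1.0, tp / len(truth))


# Toy ultrahigh-dimensional setting: p >> n, three active covariates.
rng = np.random.default_rng(0)
n, p = 100, 500
X = rng.standard_normal((n, p))
truth = [0, 1, 2]
y = X[:, truth] @ np.array([3.0, -2.0, 1.5]) + rng.standard_normal(n)

cand = screen(X, y, keep=40)        # screening shrinks the search space 500 -> 40
model = greedy_search(X, y, cand, p)
prec, rec = precision_recall(model, truth)
```

Screening is the scalability lever: each search step considers 40 candidate moves rather than 500, which is the same reason S5 remains tractable in the ultrahigh-dimensional regime; the paper's method additionally retains the full nonlocal-prior posterior machinery rather than a point score like EBIC.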
