Fast Bayesian estimation of spatial count data models
Computational Statistics & Data Analysis (IF 1.5), Pub Date: 2021-05-01, DOI: 10.1016/j.csda.2020.107152
Prateek Bansal , Rico Krueger , Daniel J. Graham

Spatial count data models are used to explain and predict the frequency of phenomena such as traffic accidents in geographically distinct entities such as census tracts or road segments. These models are typically estimated using Bayesian Markov chain Monte Carlo (MCMC) simulation methods, which, however, are computationally expensive and do not scale well to large datasets. Variational Bayes (VB), a method from machine learning, addresses the shortcomings of MCMC by casting Bayesian estimation as an optimisation problem instead of a simulation problem. In this paper, we derive a VB method for posterior inference in negative binomial models with unobserved parameter heterogeneity and spatial dependence. The proposed method uses Polya-Gamma augmentation to deal with the non-conjugacy of the negative binomial likelihood and an integrated non-factorised specification of the variational distribution to capture posterior dependencies. We demonstrate the benefits of the approach using simulated data and real data on youth pedestrian injury counts in the census tracts of New York City boroughs Bronx and Manhattan. The empirical analysis suggests that the VB approach is between 7 and 13 times faster than MCMC on a regular eight-core processor, while offering similar estimation and predictive accuracy. Conditional on the availability of computational resources, the embarrassingly parallel architecture of the proposed VB method can be exploited to further accelerate the estimation by up to 100 times.
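
The Polya-Gamma augmentation mentioned in the abstract rests on a standard integral identity, which can be checked numerically. The sketch below is only an illustration, not the authors' implementation. It assumes a negative binomial kernel with count y, dispersion r and linear predictor psi under a logit-type link, and the function names are hypothetical. It uses the known Laplace transform of a PG(b, 0) variable, E[exp(-omega * t)] = cosh(sqrt(t / 2))^(-b), to evaluate the augmented form in closed form.

```python
import math


def nb_kernel(psi, y, r):
    """Negative binomial likelihood kernel in psi (assumed parameterisation):
    (e^psi)^y / (1 + e^psi)^(y + r)."""
    return math.exp(y * psi) / (1.0 + math.exp(psi)) ** (y + r)


def pg_augmented_kernel(psi, y, r):
    """Same kernel written through the Polya-Gamma identity:
    2^(-b) * exp(kappa * psi) * E[exp(-omega * psi^2 / 2)],
    with b = y + r, kappa = (y - r) / 2 and omega ~ PG(b, 0)."""
    b = y + r
    kappa = (y - r) / 2.0
    # Laplace transform of PG(b, 0) evaluated at t = psi^2 / 2.
    pg_expectation = math.cosh(psi / 2.0) ** (-b)
    return 2.0 ** (-b) * math.exp(kappa * psi) * pg_expectation


if __name__ == "__main__":
    # Compare both forms at a few arbitrary illustrative points.
    for psi in (-1.3, 0.0, 0.7):
        print(psi, nb_kernel(psi, 3, 2.5), pg_augmented_kernel(psi, 3, 2.5))
```

Conditional on omega, the exponent in the augmented form is quadratic in psi, so the kernel is Gaussian in psi. This is what restores conjugacy with Gaussian priors and makes closed-form variational updates possible.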

Updated: 2021-05-01