The computational asymptotics of Gaussian variational inference and the Laplace approximation
Statistics and Computing (IF 1.6), Pub Date: 2022-08-09, DOI: 10.1007/s11222-022-10125-y
Zuheng Xu, Trevor Campbell

Gaussian variational inference and the Laplace approximation are popular alternatives to Markov chain Monte Carlo that formulate Bayesian posterior inference as an optimization problem, enabling the use of simple and scalable stochastic optimization algorithms. However, a key limitation of both methods is that the solution to the optimization problem is typically not tractable to compute; even in simple settings, the problem is nonconvex. Thus, recently developed statistical guarantees—which all involve the (data) asymptotic properties of the global optimum—are not reliably obtained in practice. In this work, we provide two major contributions: a theoretical analysis of the asymptotic convexity properties of variational inference with a Gaussian family and the maximum a posteriori (MAP) problem required by the Laplace approximation, and two algorithms—consistent Laplace approximation (CLA) and consistent stochastic variational inference (CSVI)—that exploit these properties to find the optimal approximation in the asymptotic regime. Both CLA and CSVI involve a tractable initialization procedure that finds the local basin of the optimum, and CSVI further includes a scaled gradient descent algorithm that provably stays locally confined to that basin. Experiments on nonconvex synthetic and real-data examples show that compared with standard variational and Laplace approximations, both CSVI and CLA improve the likelihood of obtaining the global optimum of their respective optimization problems.
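For concreteness, below is a minimal NumPy/SciPy sketch of the standard Laplace approximation the abstract refers to, not the authors' CLA algorithm: a local optimizer finds a MAP candidate, and the Gaussian covariance is taken as the inverse Hessian of the negative log posterior at that point. The mixture target and starting points are hypothetical, chosen only to illustrate how initialization determines which local basin the optimizer lands in, the failure mode that motivates the paper's consistent variants.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_posterior(theta):
    # Hypothetical nonconvex target: a two-component Gaussian mixture posterior.
    d1 = np.sum((theta - 2.0) ** 2)
    d2 = np.sum((theta + 2.0) ** 2)
    return -np.logaddexp(-0.5 * d1, -0.5 * d2 - 1.0)

def laplace_approximation(theta0, eps=1e-4):
    # MAP step: a local optimizer, so the answer depends on theta0.
    res = minimize(neg_log_posterior, theta0, method="BFGS")
    mu = res.x
    # Finite-difference Hessian of the negative log posterior at the MAP point.
    D = mu.size
    H = np.zeros((D, D))
    for i in range(D):
        for j in range(D):
            e_i, e_j = np.eye(D)[i] * eps, np.eye(D)[j] * eps
            H[i, j] = (neg_log_posterior(mu + e_i + e_j)
                       - neg_log_posterior(mu + e_i - e_j)
                       - neg_log_posterior(mu - e_i + e_j)
                       + neg_log_posterior(mu - e_i - e_j)) / (4 * eps ** 2)
    # Laplace approximation: N(mu, H^{-1}).
    Sigma = np.linalg.inv(H)
    return mu, Sigma

# Different starts land in different local basins of the nonconvex MAP problem,
# yielding different Gaussian approximations.
for start in (np.array([3.0]), np.array([-3.0])):
    mu, Sigma = laplace_approximation(start)
    print(start, "->", mu, Sigma)
```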



