Newtonian Monte Carlo: single-site MCMC meets second-order gradient methods
arXiv - CS - Machine Learning. Pub Date: 2020-01-15, DOI: arxiv-2001.05567
Nimar S. Arora, Nazanin Khosravani Tehrani, Kinjal Divesh Shah, Michael Tingley, Yucen Lily Li, Narjes Torabi, David Noursi, Sepehr Akhavan Masouleh, Eric Lippert, Erik Meijer

Single-site Markov Chain Monte Carlo (MCMC) is a variant of MCMC in which a single coordinate in the state space is modified in each step. Structured relational models are a good candidate for this style of inference. In the single-site context, second order methods become feasible because the typical cubic costs associated with these methods are now restricted to the dimension of each coordinate. Our work, which we call Newtonian Monte Carlo (NMC), is a method to improve MCMC convergence by analyzing the first and second order gradients of the target density to determine a suitable proposal density at each point. Existing first order gradient-based methods suffer from the problem of determining an appropriate step size: too small a step size requires a large number of steps to converge, while too large a step size causes the sampler to overshoot the high density region. NMC is similar to the Newton-Raphson update in optimization, where the second order gradient is used to automatically scale the step size in each dimension. However, our objective is to find a parameterized proposal density rather than the maximum. As a further improvement on existing first and second order methods, we show that random variables with constrained supports do not need to be transformed before taking a gradient step. We demonstrate the efficiency of NMC on a number of different domains. For statistical models where the prior is conjugate to the likelihood, our method recovers the posterior quite trivially in one step. However, we also show results on fairly large non-conjugate models, where NMC performs better than adaptive first order methods such as NUTS or other inexact scalable inference methods such as Stochastic Variational Inference or bootstrapping.
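
To make the core idea concrete: for a real-valued, unconstrained coordinate, the abstract describes fitting a Gaussian proposal at the current point x whose mean is the Newton-Raphson point x - H^{-1} g (with g and H the gradient and Hessian of the target log density) and whose covariance is -H^{-1}, followed by a standard Metropolis-Hastings correction. The Python sketch below is an illustrative reading of that description, not the authors' implementation; the toy 2D Gaussian target, the function names (nmc_step, log_p, grad, hess), and the step count are assumptions, and the constrained-support proposals mentioned in the abstract are not covered.

# Minimal sketch of a Newton-style Metropolis-Hastings step (illustrative, not the
# authors' code): fit a Gaussian proposal from the first and second order gradients
# of the target log density, then accept or reject with the usual MH correction.
import numpy as np

def nmc_step(x, log_p, grad, hess, rng):
    """One MH step with proposal q(.|x) = N(x - H(x)^{-1} g(x), -H(x)^{-1}),
    assuming the Hessian H of log_p is negative definite at x."""
    def proposal_params(z):
        g, H = grad(z), hess(z)
        cov = -np.linalg.inv(H)      # -H^{-1}, positive definite when H is negative definite
        mean = z + cov @ g           # z - H^{-1} g, the Newton-Raphson target point
        return mean, cov

    def log_q(dst, src):             # log density of proposing dst from src
        mean, cov = proposal_params(src)
        diff = dst - mean
        _, logdet = np.linalg.slogdet(cov)
        return -0.5 * (diff @ np.linalg.solve(cov, diff) + logdet + len(dst) * np.log(2 * np.pi))

    mean, cov = proposal_params(x)
    y = rng.multivariate_normal(mean, cov)
    log_alpha = (log_p(y) - log_p(x)) + (log_q(x, y) - log_q(y, x))
    return y if np.log(rng.uniform()) < log_alpha else x

# Toy target: a 2D Gaussian. Its log density is exactly quadratic, so the fitted
# proposal coincides with the target itself -- mirroring the abstract's remark that
# conjugate posteriors are recovered in one step.
Sigma = np.array([[1.0, 0.6], [0.6, 2.0]])
mu = np.array([1.0, -2.0])
Sigma_inv = np.linalg.inv(Sigma)

log_p = lambda x: -0.5 * (x - mu) @ Sigma_inv @ (x - mu)   # unnormalized is enough for MH
grad  = lambda x: -Sigma_inv @ (x - mu)
hess  = lambda x: -Sigma_inv

rng = np.random.default_rng(0)
x = np.zeros(2)
samples = []
for _ in range(1000):
    x = nmc_step(x, log_p, grad, hess, rng)
    samples.append(x)
print("sample mean:", np.mean(samples, axis=0))   # approaches mu

Because the toy target is Gaussian, every proposal is drawn from the target itself and is accepted; for a non-quadratic log density the Hessian varies with position, the proposal becomes a local Gaussian approximation, and the asymmetric MH correction in log_alpha is what keeps the chain exact.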

Updated: 2020-01-17