Sophisticated Inference
Neural Computation (IF 2.9), Pub Date: 2021-02-24, DOI: 10.1162/neco_a_01351
Karl Friston, Lancelot Da Costa, Danijar Hafner, Casper Hesp, Thomas Parr

Active inference offers a first-principles account of sentient behavior, from which special and important cases (for example, reinforcement learning, active learning, Bayes-optimal inference, and Bayes-optimal design) can be derived. Active inference finesses the exploitation-exploration dilemma in relation to prior preferences by placing information gain on the same footing as reward or value. In brief, active inference replaces value functions with functionals of (Bayesian) beliefs, in the form of an expected (variational) free energy. In this letter, we consider a sophisticated kind of active inference using a recursive form of expected free energy. Sophistication describes the degree to which an agent has beliefs about beliefs. We consider agents with beliefs about the counterfactual consequences of action for states of affairs and beliefs about those latent states. In other words, we move from simply considering beliefs about “what would happen if I did that” to “what I would believe about what would happen if I did that.” The recursive form of the free energy functional effectively implements a deep tree search over actions and outcomes in the future. Crucially, this search is over sequences of belief states as opposed to states per se. We illustrate the competence of this scheme using numerical simulations of deep decision problems.
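
The recursion described in the abstract can be illustrated with a minimal sketch for a small discrete model. This is not the authors' implementation. It assumes a likelihood matrix `A` (outcomes given states), one transition matrix per action in `B`, and log prior preferences over outcomes `log_C`; these names, the risk-plus-ambiguity decomposition of the one-step term, and the softmax over negative expected free energy are assumptions chosen for illustration. Each call scores an action from a belief state, then recurses over every plausible outcome and the posterior belief that outcome would induce, so the search runs over beliefs rather than states.

```python
import numpy as np

EPS = 1e-16  # guards log(0); illustrative choice


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def expected_free_energy(q, u, A, B, log_C, depth):
    """Recursive expected free energy of action u from belief q (hypothetical helper).

    q     : belief over hidden states, shape (n_states,)
    A     : P(outcome | state), shape (n_outcomes, n_states)
    B     : list of P(next state | state) matrices, one per action
    log_C : log prior preferences over outcomes, shape (n_outcomes,)
    depth : number of future steps to search
    """
    qs = B[u] @ q   # predicted belief over next states
    qo = A @ qs     # predicted distribution over outcomes

    # Risk: divergence of predicted outcomes from preferred outcomes.
    risk = np.sum(qo * (np.log(qo + EPS) - log_C))
    # Ambiguity: expected entropy of the outcome likelihood.
    entropy_per_state = -np.sum(A * np.log(A + EPS), axis=0)
    ambiguity = entropy_per_state @ qs
    G = risk + ambiguity

    if depth > 1:
        for o, p_o in enumerate(qo):
            if p_o < 1e-8:
                continue  # prune outcomes that are effectively impossible
            # Belief the agent would hold after seeing outcome o.
            post = A[o] * qs
            post = post / post.sum()
            G_next = np.array([
                expected_free_energy(post, v, A, B, log_C, depth - 1)
                for v in range(len(B))
            ])
            # Average the next-step values under the belief about the next action.
            G += p_o * (softmax(-G_next) @ G_next)
    return G


def action_posterior(q, A, B, log_C, depth):
    """Belief over actions, proportional to exp(-G), plus the G values."""
    G = np.array([
        expected_free_energy(q, u, A, B, log_C, depth)
        for u in range(len(B))
    ])
    return softmax(-G), G
```

In this sketch the cost of the search grows with the number of actions and plausible outcomes raised to the power of `depth`, which is why low-probability outcomes are pruned.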




Updated: 2021-02-25