Distributional Reinforcement Learning in the Brain
Trends in Neurosciences ( IF 14.6 ) Pub Date : 2020-12-01 , DOI: 10.1016/j.tins.2020.09.004
Adam S. Lowet, Qiao Zheng, Sara Matias, Jan Drugowitsch, Naoshige Uchida
Learning about rewards and punishments is critical for survival. Classical studies have demonstrated an impressive correspondence between the firing of dopamine neurons in the mammalian midbrain and the reward prediction errors of reinforcement learning algorithms, which express the difference between actual reward and predicted mean reward. However, it may be advantageous to learn not only the mean but also the complete distribution of potential rewards. Recent advances in machine learning have revealed a biologically plausible set of algorithms for reconstructing this reward distribution from experience. Here, we review the mathematical foundations of these algorithms as well as initial evidence for their neurobiological implementation. We conclude by highlighting outstanding questions regarding the circuit computation and behavioral readout of these distributional codes.
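The contrast the abstract draws, between learning only the mean reward and learning its full distribution, can be illustrated with a minimal simulation. The sketch below is not the authors' model; it is a hedged toy example of one biologically plausible algorithm from this literature (expectile-style distributional TD, in which units scale positive and negative prediction errors asymmetrically), with all parameter values chosen for illustration.

```python
import random

random.seed(0)

# Toy reward source: 1.0 with probability 0.5, else 0.0.
def sample_reward():
    return 1.0 if random.random() < 0.5 else 0.0

alpha = 0.02  # learning rate (illustrative value)

# Classical TD learning: a single estimate V tracks the MEAN reward.
# delta = r - V is the reward prediction error from the abstract.
V = 0.0
for _ in range(20000):
    r = sample_reward()
    V += alpha * (r - V)

# Distributional variant: several predictors, each weighting positive
# vs. negative prediction errors asymmetrically (asymmetry tau).
# Each estimate converges to a different expectile of the reward
# distribution, so together they span it rather than just its mean.
taus = [0.1, 0.5, 0.9]            # pessimistic, balanced, optimistic
values = [0.0 for _ in taus]
for _ in range(200000):
    r = sample_reward()
    for i, tau in enumerate(taus):
        delta = r - values[i]
        rate = tau if delta > 0 else (1.0 - tau)
        values[i] += alpha * rate * delta

print(round(V, 2), [round(v, 2) for v in values])
```

After training, the single estimate sits near the mean reward, while the asymmetric predictors fan out below and above it, recovering information about the spread of outcomes that a mean-only code discards.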

Updated: 2020-12-01