A gradual temporal shift of dopamine responses mirrors the progression of temporal difference error in machine learning
Nature Neuroscience (IF 21.2), Pub Date: 2022-07-07, DOI: 10.1038/s41593-022-01109-2
Ryunosuke Amo 1 , Sara Matias 1 , Akihiro Yamanaka 2 , Kenji F Tanaka 3 , Naoshige Uchida 1 , Mitsuko Watabe-Uchida 1

A large body of evidence has indicated that the phasic responses of midbrain dopamine neurons show a remarkable similarity to a type of teaching signal (temporal difference (TD) error) used in machine learning. However, previous studies failed to observe a key prediction of this algorithm: that when an agent associates a cue and a reward that are separated in time, the timing of dopamine signals should gradually move backward in time from the time of the reward to the time of the cue over multiple trials. Here we demonstrate that such a gradual shift occurs both at the level of dopaminergic cellular activity and dopamine release in the ventral striatum in mice. Our results establish a long-sought link between dopaminergic activity and the TD learning algorithm, providing fundamental insights into how the brain associates cues and rewards that are separated in time.
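The prediction at issue can be made concrete with a minimal tabular TD(0) simulation. This is an illustrative sketch with arbitrary parameters (10 time steps, learning rate 0.1, undiscounted), not the authors' model or data: a cue occurs at the first time step, a reward at the last, and the agent learns a value for each intervening step.

```python
import numpy as np

T = 10          # time steps from cue (t = 0) to reward (t = T - 1); illustrative
alpha = 0.1     # learning rate; illustrative
episodes = 200

V = np.zeros(T + 1)      # value estimate per time step; V[T] is terminal (0)
reward = np.zeros(T)
reward[T - 1] = 1.0      # a single reward delivered at the last step

peak_times = []
for ep in range(episodes):
    deltas = np.zeros(T)
    for t in range(T):
        # TD error: observed reward plus next-step value minus current value
        deltas[t] = reward[t] + V[t + 1] - V[t]
        V[t] += alpha * deltas[t]
    # record where in the trial the TD error peaks this episode
    peak_times.append(int(np.argmax(deltas)))

# Early in training the TD error peaks at reward time;
# late in training it has migrated back to the cue.
print(peak_times[0], peak_times[-1])  # → 9 0
```

Running this shows the key signature the study looks for in dopamine signals: the peak of the TD error starts at the reward step and moves gradually backward toward the cue over trials, rather than jumping there in one step.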



