A Robotic Model of Hippocampal Reverse Replay for Reinforcement Learning
arXiv - CS - Robotics Pub Date : 2021-02-23 , DOI: arxiv-2102.11914 Matthew T. Whelan, Tony J. Prescott, Eleni Vasilaki
Hippocampal reverse replay is thought to contribute to learning, and
particularly reinforcement learning, in animals. We present a computational
model of learning in the hippocampus that builds on a previous model of the
hippocampal-striatal network viewed as implementing a three-factor
reinforcement learning rule. To augment this model with hippocampal reverse
replay, a novel policy gradient learning rule is derived that associates place
cell activity with responses in cells representing actions. This new model is
evaluated using a simulated robot spatial navigation task inspired by the
Morris water maze. Results show that reverse replay can accelerate learning
from reinforcement, whilst improving stability and robustness over multiple
trials. Consistent with the neurobiological data, our study suggests that
reverse replay can make a significant positive contribution to reinforcement
learning, although less efficient and less stable learning remains possible in
its absence. We conclude that reverse replay may enhance reinforcement learning in
the mammalian hippocampal-striatal system rather than provide its core
mechanism.
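The combination described above, a three-factor update applied to place-cell-to-action-cell weights during reverse replay, can be sketched as follows. This is a minimal illustration under assumed choices (one-hot place codes, a softmax action readout, grid size, learning rate, and function names are all hypothetical), not the paper's actual learning rule or network.

```python
import numpy as np

N_PLACE, N_ACTION = 25, 4          # place cells (5x5 grid) and action cells; sizes illustrative
W = np.zeros((N_ACTION, N_PLACE))  # place-to-action weights

def softmax_policy(place_activity, w):
    """Action probabilities read out from place-cell activity."""
    logits = w @ place_activity
    e = np.exp(logits - logits.max())      # subtract max for numerical stability
    return e / e.sum()

def reverse_replay_update(w, trajectory, reward, lr=0.1):
    """Replay the stored trajectory in reverse order, applying a three-factor
    update: presynaptic place activity x postsynaptic action response x reward.
    A policy-gradient sketch, not the paper's exact rule."""
    w = w.copy()
    for place_activity, action in reversed(trajectory):
        post = -softmax_policy(place_activity, w)
        post[action] += 1.0                # gradient of log-probability of the chosen action
        w += lr * reward * np.outer(post, place_activity)
    return w

# Tiny demo: one visited place, one chosen action, positive reward at the goal.
place = np.zeros(N_PLACE)
place[7] = 1.0                             # one-hot place-cell activity
W_new = reverse_replay_update(W, [(place, 2)], reward=1.0)
# After replay, the rewarded action becomes more probable at the visited place.
```

Replaying the trajectory in reverse lets the single terminal reward credit every place-action pair along the path in one sweep, which is the intuition behind the accelerated learning reported in the abstract.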
Updated: 2021-02-25