Embodied Synaptic Plasticity With Online Reinforcement Learning.
Frontiers in Neurorobotics ( IF 2.6 ) Pub Date : 2019-10-03 , DOI: 10.3389/fnbot.2019.00081
Jacques Kaiser 1 , Michael Hoff 1, 2 , Andreas Konle 1 , J Camilo Vasquez Tieck 1 , David Kappel 2, 3, 4 , Daniel Reichard 1 , Anand Subramoney 2 , Robert Legenstein 2 , Arne Roennau 1 , Wolfgang Maass 2 , Rüdiger Dillmann 1

The endeavor to understand the brain involves multiple collaborating research fields. Classically, synaptic plasticity rules derived by theoretical neuroscientists are evaluated in isolation on pattern classification tasks. This contrasts with the biological brain, whose purpose is to control a body in closed loop. This paper contributes to bringing the fields of computational neuroscience and robotics closer together by integrating open-source software components from these two fields. The resulting framework makes it possible to evaluate the validity of biologically plausible plasticity models in closed-loop robotics environments. We use this framework to evaluate Synaptic Plasticity with Online REinforcement learning (SPORE), a reward-learning rule based on synaptic sampling, on two visuomotor tasks: reaching and lane following. We show that SPORE is capable of learning to perform policies for both tasks within the course of simulated hours. Provisional parameter explorations indicate that the learning rate and the temperature driving the stochastic processes that govern synaptic learning dynamics need to be regulated for performance improvements to be retained. We conclude by discussing recent deep reinforcement learning techniques that could increase the functionality of SPORE on visuomotor tasks.
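
The remark about regulating the learning rate and the temperature can be illustrated with a toy example. The sketch below is not the SPORE rule and is not taken from the paper's software. It is a generic one-parameter Langevin-type update, assumed here only to show how a learning rate and a temperature jointly set the drift and the noise of a stochastic parameter process. The function names, the quadratic toy objective, and the decay schedule are all hypothetical.

```python
import math
import random


def langevin_step(theta, grad, learning_rate, temperature, dt, rng):
    """One Euler-Maruyama step of a generic Langevin-type update (illustrative only)."""
    # Drift: follow the gradient of the toy objective, scaled by the learning rate.
    drift = learning_rate * grad * dt
    # Diffusion: Gaussian noise whose size grows with learning rate and temperature.
    noise = math.sqrt(2.0 * learning_rate * temperature * dt) * rng.gauss(0.0, 1.0)
    return theta + drift + noise


def simulate(steps=20000, learning_rate=0.5, temperature=0.1, dt=0.1,
             anneal=False, seed=0):
    """Run the toy process and return (mean, variance) of theta over the last 2000 steps.

    The toy objective is -(theta - 1)^2 / 2, so the best parameter value is 1.
    """
    rng = random.Random(seed)
    theta = -1.0
    tail = []
    for k in range(steps):
        # Hypothetical schedule: shrink learning rate and temperature together.
        scale = 1.0 / (1.0 + 0.001 * k) if anneal else 1.0
        grad = -(theta - 1.0)
        theta = langevin_step(theta, grad, learning_rate * scale,
                              temperature * scale, dt, rng)
        if k >= steps - 2000:
            tail.append(theta)
    mean = sum(tail) / len(tail)
    variance = sum((x - mean) ** 2 for x in tail) / len(tail)
    return mean, variance


if __name__ == "__main__":
    print("constant schedule:", simulate(anneal=False))
    print("decaying schedule:", simulate(anneal=True))
```

In this toy setting, a constant temperature keeps the parameter fluctuating around its best value indefinitely, while the decaying schedule lets it settle. That is one simple way to picture why an improvement, once found, might not be retained unless these two quantities are regulated.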

Updated: 2019-11-01