Improving robot dual-system motor learning with intrinsically motivated meta-control and latent-space experience imagination
Robotics and Autonomous Systems (IF 4.3), Pub Date: 2020-11-01, DOI: 10.1016/j.robot.2020.103630
Muhammad Burhan Hafez, Cornelius Weber, Matthias Kerzel, Stefan Wermter

Abstract: Combining model-based and model-free learning systems has been shown to improve the sample efficiency of learning to perform complex robotic tasks. However, existing dual-system approaches fail to consider the reliability of the learned model when it is applied to make multiple-step predictions, which causes prediction errors to compound and performance to degrade. In this paper, we present a novel dual-system motor learning approach in which a meta-controller arbitrates online between model-based and model-free decisions based on an estimate of the local reliability of the learned model. The reliability estimate is used to compute an intrinsic feedback signal that encourages actions leading to data that improves the model. Our approach also integrates arbitration with imagination: a learned latent-space model generates imagined experiences, based on its local reliability, to be used as additional training data. We evaluate our approach against baseline and state-of-the-art methods on learning vision-based robotic grasping in simulation and in the real world. The results show that our approach outperforms the compared methods and learns near-optimal grasping policies in both dense- and sparse-reward environments.
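
The arbitration idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ReliabilityArbiter`, the discrete `region` key, the reliability formula `1 / (1 + mean error)`, the fixed threshold, and the use of reliability improvement as the intrinsic signal are all assumptions chosen only to make the idea concrete.

```python
from collections import defaultdict, deque


class ReliabilityArbiter:
    """Hypothetical meta-controller sketch (assumed design, not the paper's)."""

    def __init__(self, threshold=0.5, window=20):
        # Assumed hyperparameters: decision threshold and error-history length.
        self.threshold = threshold
        self.errors = defaultdict(lambda: deque(maxlen=window))

    def reliability(self, region):
        """Local reliability in [0, 1], from recent model prediction errors."""
        errs = self.errors[region]
        if not errs:
            return 0.0  # no data yet: treat the learned model as unreliable
        mean_err = sum(errs) / len(errs)
        return 1.0 / (1.0 + mean_err)

    def update(self, region, prediction_error):
        """Record a new error and return an intrinsic signal.

        The signal is the change in local reliability, so it is positive
        when the new data made the model look more reliable in this region.
        """
        before = self.reliability(region)
        self.errors[region].append(prediction_error)
        return self.reliability(region) - before

    def choose(self, region):
        """Arbitrate: trust the model only where it is locally reliable."""
        if self.reliability(region) >= self.threshold:
            return "model_based"
        return "model_free"
```

In this sketch a controller would call `choose(region)` before acting and `update(region, error)` after observing the model's one-step prediction error. The same reliability value could gate whether imagined rollouts from the model are added to the training data, again as an assumption about how such gating might look.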

Updated: 2020-11-01