Model accelerated reinforcement learning for high precision robotic assembly
International Journal of Intelligent Robotics and Applications (IF 2.1), Pub Date: 2020-06-02, DOI: 10.1007/s41315-020-00138-z
Xin Zhao, Huan Zhao, Pengfei Chen, Han Ding

Peg-in-hole assembly with narrow clearance is a typical contact-rich robotic task in industrial manufacturing. Robot learning allows robots to acquire the assembly skills for this task directly, without modeling and recognizing the complex contact states. However, learning such skills remains challenging for robots because it is difficult to collect large amounts of transition data and to transfer skills to new tasks, which inevitably leads to low training efficiency. This paper formulates the assembly task as a Markov decision process and proposes a model-accelerated reinforcement learning method to learn the assembly policy efficiently. In this method, the assembly policy is learned within the maximum entropy reinforcement learning framework and executed with an impedance controller, which ensures exploration efficiency while allowing skills to be transferred between tasks. To reduce sample complexity and improve training efficiency, the proposed method learns the environment dynamics with a Gaussian process while training the policy; the learned dynamics model is then used to improve target value estimation and to generate virtual data that augment the transition samples. The method can robustly learn assembly skills while minimizing the need for real-world interaction, which makes it suitable for realistic assembly scenarios. To verify the proposed method, experiments are conducted on an industrial robot. The results demonstrate that the proposed method improves training efficiency by 31% compared with the same method without model acceleration, and that the learned skill can be transferred to new tasks to accelerate the training of new policies.
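
The idea of learning the dynamics with a Gaussian process and using it to generate virtual transitions can be illustrated with a minimal sketch. This is not the authors' implementation: the RBF kernel, the fixed hyperparameters, the toy one-dimensional dynamics, and the names `GPDynamics`, `generate_virtual_transitions`, `policy`, and `reward_fn` are all assumptions made for illustration, and the target value estimation step is not shown.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between row sets a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


class GPDynamics:
    """GP regression from (state, action) to the state change s' - s.

    Hypothetical helper: hyperparameters are fixed rather than optimized.
    """

    def __init__(self, noise_var=1e-4, length_scale=1.0):
        self.noise_var = noise_var
        self.length_scale = length_scale

    def fit(self, states, actions, next_states):
        self.x = np.hstack([states, actions])
        y = next_states - states  # learn the delta, not the absolute state
        k = rbf_kernel(self.x, self.x, self.length_scale)
        k += self.noise_var * np.eye(len(self.x))
        self.alpha = np.linalg.solve(k, y)
        return self

    def predict(self, states, actions):
        """Posterior mean of the next state for each (state, action) row."""
        x_new = np.hstack([states, actions])
        k_star = rbf_kernel(x_new, self.x, self.length_scale)
        return states + k_star @ self.alpha


def generate_virtual_transitions(model, states, policy, reward_fn):
    """Roll the learned model one step from real states to get virtual samples."""
    actions = policy(states)
    next_states = model.predict(states, actions)
    rewards = reward_fn(states, actions, next_states)
    return states, actions, rewards, next_states


# Toy data from an assumed linear system s' = s + 0.1 * a (illustration only).
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=(50, 1))
actions = rng.uniform(-1.0, 1.0, size=(50, 1))
next_states = states + 0.1 * actions

model = GPDynamics().fit(states, actions, next_states)
policy = lambda s: -0.5 * s                          # placeholder policy
reward_fn = lambda s, a, s2: -np.abs(s2).sum(axis=1)  # placeholder reward
v_s, v_a, v_r, v_s2 = generate_virtual_transitions(
    model, states[:10], policy, reward_fn
)
```

In an off-policy learner, tuples such as `(v_s, v_a, v_r, v_s2)` would be mixed into the replay buffer alongside real transitions, which is one common way a learned model can reduce the number of real-world interactions.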

Updated: 2020-06-02