Energy-efficient and damage-recovery slithering gait design for a snake-like robot based on reinforcement learning and inverse reinforcement learning.
Neural Networks (IF 7.8), Pub Date: 2020-06-16, DOI: 10.1016/j.neunet.2020.05.029
Zhenshan Bing, Christian Lemke, Long Cheng, Kai Huang, Alois Knoll

Similar to real snakes in nature, the flexible trunks of snake-like robots enhance their movement capabilities and adaptability in diverse environments. However, this flexibility entails a complex control task involving highly redundant degrees of freedom, where traditional model-based methods usually fail to propel the robots energy-efficiently or to adapt to unforeseeable joint damage. In this work, we present an approach for designing an energy-efficient and damage-recovery slithering gait for a snake-like robot using reinforcement learning (RL) and inverse reinforcement learning (IRL). Specifically, we first present an RL-based controller for generating locomotion gaits over a wide range of velocities, trained using the proximal policy optimization (PPO) algorithm. Then, by taking the RL-based controller as an expert and collecting trajectories from it, we train an IRL-based controller using the adversarial inverse reinforcement learning (AIRL) algorithm. For comparison, a traditional parameterized gait controller is presented as the baseline, with its parameter sets optimized using grid search and Bayesian optimization. Based on the analysis of the simulation results, we first demonstrate that the RL-based controller exhibits very natural and adaptive movements, which are also substantially more energy-efficient than the gaits generated by the parameterized controller. We then demonstrate that the IRL-based controller not only exhibits performance comparable to the RL-based controller, but can also recover from unpredictable damage to its body joints and still outperform the model-based controller (which has an undamaged body) in terms of energy efficiency. Videos can be viewed at https://videoviewsite.wixsite.com/rlsnake.
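For context, the parameterized baseline mentioned in the abstract is typically a serpenoid-curve gait, whose amplitude, frequency, and phase lag are tuned by search. The sketch below is an illustrative reconstruction, not the authors' code: the joint set-point formula is the standard serpenoid gait, and `cost_of_transport` is a made-up surrogate standing in for the energy-per-distance measurements the paper would obtain from simulation.

```python
import itertools
import math

def serpenoid_joint_angles(t, n_joints=8, alpha=0.5, omega=2.0,
                           beta=0.6, gamma=0.0):
    """Joint set-points at time t for a serpenoid (lateral undulation) gait.

    alpha: amplitude (rad); omega: temporal frequency (rad/s);
    beta: phase lag between adjacent joints; gamma: turning offset.
    """
    return [alpha * math.sin(omega * t + i * beta) + gamma
            for i in range(n_joints)]

def cost_of_transport(alpha, omega):
    """Hypothetical surrogate cost; in the paper this would come from
    measuring actuation power and forward speed in simulation."""
    speed = alpha * omega                 # forward-speed proxy
    power = (alpha * omega) ** 2 + 0.1    # actuation-power proxy
    return power / max(speed, 1e-9)      # energy per unit distance

# Grid search over gait parameters, as done for the baseline controller.
grid = itertools.product([0.3, 0.5, 0.7], [1.0, 2.0, 3.0])
best_alpha, best_omega = min(grid, key=lambda p: cost_of_transport(*p))
```

Bayesian optimization would replace the exhaustive grid with a surrogate model that proposes promising parameter sets sequentially, which matters once the parameter space grows beyond a few dimensions.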
Updated: 2020-06-25