Reinforcement learning-based collision-free path planner for redundant robot in narrow duct
Journal of Intelligent Manufacturing ( IF 8.3 ) Pub Date : 2020-05-27 , DOI: 10.1007/s10845-020-01582-1
Xiaotong Hua , Guolei Wang , Jing Xu , Ken Chen

Compared with obstacle avoidance in an open environment, collision-free path planning for the duct-entering task is challenged by the narrow and complex space inside ducts. A redundant robot is usually applied to this task for obstacle avoidance. The motion of a redundant robot can be decoupled into end-effector motion and self-motion. Current methods for the duct-entering task are not robust because the self-motion is difficult to define properly; this difficulty arises from two aspects: defining the distances from the robot to obstacles and fusing multiple sources of data. In this work, we adapt the ideas underlying human success at such tasks, namely variable optimization strategies and learning, to build a robust path planner. The proposed planner applies reinforcement learning to learn a proper self-motion and thereby achieves robust planning. To achieve robust behavior, the state-action space is designed with three dedicated strategies. First, the optimization function, the kernel of the self-motion, is treated as part of the action. Instead of acting on every joint motion directly, this strategy embeds reinforcement learning in the self-motion, reducing the search domain to the null space of the redundant robot. Second, the robot end orientation is included in the action. For the duct-entering task, the robot's end link leads the exploring motion, much like a snake's head, and the orientation of the end link when passing a position can be referenced by the following links. This second strategy therefore accelerates exploration by reducing the null space to the feasible manifold of the redundant robot. Third, a path guide point is added to the action. This strategy divides one long-distance task into several short-distance tasks, reducing the task difficulty. With these designs in place, the planner is trained by reinforcement learning.
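The decoupling into end-effector motion and self-motion described above is conventionally realized through the Jacobian pseudoinverse and its null-space projector. The following is a minimal sketch of that decomposition, assuming a 7-DoF arm with a 6-dimensional task and a stand-in random Jacobian (not the paper's robot model or learned self-motion):

```python
import numpy as np

# Joint velocity = end-effector term + self-motion projected into the
# Jacobian null space:  q_dot = J+ @ x_dot + (I - J+ @ J) @ z
rng = np.random.default_rng(0)
n_joints, task_dim = 7, 6              # assumed redundant arm: 7 DoF, 6-DoF task

J = rng.standard_normal((task_dim, n_joints))   # stand-in Jacobian
J_pinv = np.linalg.pinv(J)                      # Moore-Penrose pseudoinverse
N = np.eye(n_joints) - J_pinv @ J               # null-space projector

x_dot = rng.standard_normal(task_dim)           # desired end-effector velocity
z = rng.standard_normal(n_joints)               # arbitrary self-motion vector
q_dot = J_pinv @ x_dot + N @ z                  # combined joint velocity

# The self-motion term lies in the null space, so it does not disturb
# the end-effector task:
print("end-effector velocity error:", np.linalg.norm(J @ q_dot - x_dot))
```

Because `J @ N` vanishes identically, any choice of `z` (in the paper, the output of the learned optimization strategy) reshapes the arm inside the duct without moving the end-effector off its commanded path.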
Using feedback on the robot and environment state, the proposed planner chooses proper optimization strategies, much as the human brain does, to avoid collision between the robot body and the target duct. Compared with two general methods, the Virtual Axis method with orientation Guidance and the Virtual Axis method, experimental results show that the success rate is improved by 5.9% and 49.7%, respectively. Two further experiments are carried out: the proposed planner achieves a 100% success rate with a constant start point and a 98.7% success rate with a random start point, which means it can handle perturbations of the start and goal points. These experiments demonstrate the robustness of the proposed planner.
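The guide-point strategy from the abstract, splitting one long-distance task into several short-distance sub-tasks, can be illustrated with a toy sketch. The linear placement of guide points below is an illustrative assumption only; in the paper, the guide point is part of the learned action rather than a fixed interpolation:

```python
import numpy as np

def split_with_guide_points(start, goal, n_guides):
    """Return start, n_guides intermediate guide points, and goal
    placed evenly along the straight segment (hypothetical helper)."""
    alphas = np.linspace(0.0, 1.0, n_guides + 2)
    return [start + a * (goal - start) for a in alphas]

start = np.array([0.0, 0.0, 0.0])   # assumed duct entrance
goal = np.array([0.9, 0.0, 0.3])    # assumed target pose inside the duct
waypoints = split_with_guide_points(start, goal, n_guides=2)

# Each consecutive pair is now a shorter, easier planning problem.
for a, b in zip(waypoints[:-1], waypoints[1:]):
    print("sub-task length:", np.linalg.norm(b - a))
```

Planning segment by segment keeps each sub-task's exploration horizon short, which is the stated motivation for adding the guide point to the action.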




Updated: 2020-05-27