Inverse reinforcement learning-based time-dependent A* planner for human-aware robot navigation with local vision, Advanced Robotics



Advanced Robotics (IF 1.4), Pub Date: 2020-04-16
Shiying Sun, Xiaoguang Zhao, Qianzhong Li, Min Tan

In environments where robots coexist with humans, mobile robots should be human-aware and comply with human behavioural norms so that they do not intrude on people's personal space or disturb their activities. In this work, we propose an inverse reinforcement learning-based time-dependent A* planner for human-aware robot navigation with local vision. The planning process of time-dependent A* is treated as a Markov decision process, and the cost function of the time-dependent A* is learned by inverse reinforcement learning from captured human demonstration trajectories. With this method, a robot can plan a path that complies with both human behaviour patterns and the robot's kinematics. To account for local vision when constructing the feature vector of the cost function, we propose a visual coverage feature that enables robots to learn from how humans move within a limited visual field. The effectiveness of the proposed method has been validated by experiments in real-world scenarios: with this approach, robots can effectively mimic human motion patterns when avoiding pedestrians; furthermore, in a limited visual field, robots can learn to choose paths that give them larger visual coverage, which yields better navigation performance.
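The core idea in the abstract, a time-dependent A* search whose step cost is a weighted sum of features, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the grid world, the constant-velocity pedestrian prediction, the two features (elapsed time and proximity to the nearest pedestrian), and the hand-set weights. In the paper the weights are learned by inverse reinforcement learning from human demonstrations, and the feature vector also includes a visual coverage term; neither is reproduced here.

```python
import heapq
import math


def human_position(human, t):
    """Hypothetical constant-velocity prediction; human = (x0, y0, vx, vy)."""
    x0, y0, vx, vy = human
    return (x0 + vx * t, y0 + vy * t)


def features(cell, t, humans):
    """Illustrative feature vector for arriving at `cell` at time `t`.

    f[0] = 1.0 (one time step elapsed)
    f[1] = Gaussian-like proximity to the nearest predicted pedestrian
    """
    proximity = 0.0
    for human in humans:
        hx, hy = human_position(human, t)
        d = math.hypot(cell[0] - hx, cell[1] - hy)
        proximity = max(proximity, math.exp(-d * d))
    return (1.0, proximity)


def step_cost(weights, cell, t, humans):
    """Linear cost w . f(s, t); the weights stand in for IRL-learned values."""
    return sum(w * f for w, f in zip(weights, features(cell, t, humans)))


def time_dependent_astar(start, goal, size, humans, weights, max_t=50):
    """A* over (cell, time) states on a size x size grid.

    Returns a list of (cell, t) pairs from start to goal, or None.
    """
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1), (0, 0)]  # last entry = wait

    def heuristic(cell):
        # Every step costs at least weights[0], so this bound is admissible
        # as long as all weights and features are non-negative.
        return weights[0] * (abs(cell[0] - goal[0]) + abs(cell[1] - goal[1]))

    open_heap = [(heuristic(start), 0.0, start, 0)]
    best = {(start, 0): 0.0}
    parent = {}
    while open_heap:
        _, g, cell, t = heapq.heappop(open_heap)
        if cell == goal:
            path = [(cell, t)]
            while (cell, t) in parent:
                cell, t = parent[(cell, t)]
                path.append((cell, t))
            return path[::-1]
        if g > best.get((cell, t), math.inf) or t >= max_t:
            continue  # stale heap entry or time horizon reached
        for dx, dy in moves:
            nxt = (cell[0] + dx, cell[1] + dy)
            if not (0 <= nxt[0] < size and 0 <= nxt[1] < size):
                continue
            ng = g + step_cost(weights, nxt, t + 1, humans)
            if ng < best.get((nxt, t + 1), math.inf):
                best[(nxt, t + 1)] = ng
                parent[(nxt, t + 1)] = (cell, t)
                heapq.heappush(open_heap, (ng + heuristic(nxt), ng, nxt, t + 1))
    return None
```

Because time is part of the search state, the same cell can be cheap at one moment and expensive at another, which is what lets the planner wait for or route around a moving pedestrian. Raising the second weight relative to the first trades a longer travel time for a wider berth around people.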


Updated: 2020-04-16
