Path planning of mobile robot in unknown dynamic continuous environment using reward-modified deep Q-network
Optimal Control Applications and Methods (IF 2.0). Pub Date: 2021-09-02, DOI: 10.1002/oca.2781
Runnan Huang, Chengxuan Qin, Jian Ling Li, Xuejing Lan

The path planning problem of a mobile robot in an unknown dynamic environment (UDE) is discussed in this article by building a continuous dynamic simulation environment. To achieve a collision-free path in the UDE, reinforcement learning with a deep Q-network (DQN) is applied so that the mobile robot learns optimal decisions. A weighted reward function is designed to balance obstacle avoidance against approaching the goal. Moreover, it is found that the relative motion between moving obstacles and the robot may cause abnormal rewards, which can further lead to a collision between the robot and an obstacle. To address this problem, two reward thresholds are set to modify the abnormal rewards, and the experiments show that the robot avoids all obstacles and reaches the goal successfully. Finally, double DQN (DDQN) and dueling DQN are applied. This article compares the results of reward-modified DQN (RMDQN), reward-modified DDQN (RMDDQN), dueling RMDQN, and dueling RMDDQN, and concludes that RMDDQN performs best.
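To make the reward-modification idea concrete, here is a minimal sketch of a weighted, threshold-clipped step reward. Everything in it is an assumption for illustration: the distance-based shaping terms, the weight W, the thresholds R_LOW and R_HIGH, and the function name reward are hypothetical and are not taken from the paper.

```python
import numpy as np

# Sketch of a weighted reward with two clipping thresholds (assumed form).
W = 0.5                    # weight balancing goal approach vs. obstacle avoidance (assumed)
R_LOW, R_HIGH = -1.0, 1.0  # the two reward thresholds (assumed values)

def reward(d_goal_prev, d_goal, d_obs_prev, d_obs):
    """Step reward from distances to the goal and the nearest obstacle.

    d_goal / d_obs: current distances; *_prev: distances at the previous step.
    """
    # Positive when the robot gets closer to the goal.
    r_goal = d_goal_prev - d_goal
    # Positive when the robot moves away from the nearest obstacle.
    r_obs = d_obs - d_obs_prev
    r = W * r_goal + (1.0 - W) * r_obs
    # A moving obstacle's relative motion can make r_obs spike abnormally;
    # clipping at the two thresholds suppresses these abnormal rewards.
    return float(np.clip(r, R_LOW, R_HIGH))
```

Clipping the combined reward at the two thresholds caps the spikes that a fast-moving obstacle's relative motion would otherwise inject into the Q-learning target.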
