A path planning strategy unified with a COLREGS collision avoidance function based on deep reinforcement learning and artificial potential field
Applied Ocean Research (IF 4.3) Pub Date: 2021-06-27, DOI: 10.1016/j.apor.2021.102759
Lingyu Li, Defeng Wu, Youqiang Huang, Zhi-Ming Yuan

Improving the autopilot capability of ships is particularly important for ensuring the safety of maritime navigation. The unmanned surface vessel (USV) with autopilot capability is a development trend for ships of the future. The objective of this paper is to investigate the path planning problem of USVs in uncertain environments; a path planning strategy unified with a collision avoidance function based on deep reinforcement learning (DRL) is proposed. A deep Q-learning network (DQN) continuously interacts with a visually simulated environment to obtain experience data, so that the agent learns the best action strategies in that environment. To address the collision avoidance problems that may occur during USV navigation, the location of the obstacle ship is divided into four collision avoidance zones according to the International Regulations for Preventing Collisions at Sea (COLREGS). To obtain an improved DRL algorithm, the artificial potential field (APF) method is utilized to improve the action space and reward function of the DQN algorithm. Simulation experiments are used to test the effectiveness of the method in various situations. The results show that the enhanced DRL can effectively realize autonomous collision avoidance path planning.
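The two ingredients described above can be illustrated with a minimal sketch. The bearing thresholds for the four COLREGS encounter zones and the APF gains (`k_att`, `k_rep`, `d0`) below are illustrative assumptions following common COLREGS conventions, not the paper's exact partition or reward design:

```python
import math

def colregs_zone(rel_bearing_deg):
    """Classify an obstacle ship into one of four COLREGS encounter zones.

    rel_bearing_deg: bearing of the obstacle measured clockwise from own
    ship's heading, in degrees. Thresholds are assumed, COLREGS-typical values.
    """
    b = rel_bearing_deg % 360.0
    if b <= 5.0 or b >= 355.0:
        return "head-on"            # Rule 14: nearly reciprocal courses
    if 5.0 < b < 112.5:
        return "crossing-give-way"  # Rule 15: obstacle on starboard side
    if 247.5 < b < 355.0:
        return "crossing-stand-on"  # obstacle on port side
    return "overtaking"             # Rule 13: approaching from abaft the beam

def apf_reward(pos, goal, obstacles, k_att=1.0, k_rep=100.0, d0=20.0):
    """APF-shaped reward: an attractive term pulling toward the goal plus a
    repulsive penalty inside the influence radius d0 of each obstacle."""
    reward = -k_att * math.dist(pos, goal)
    for obs in obstacles:
        d = max(math.dist(pos, obs), 1e-6)  # guard against division by zero
        if d < d0:
            reward -= k_rep * (1.0 / d - 1.0 / d0)
    return reward
```

In a DQN training loop, `colregs_zone` would select which rule-compliant actions are admissible for the current encounter, while `apf_reward` would replace a sparse goal-only reward with a dense signal, which is the general idea behind using APF to shape the DQN's action space and reward function.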




Updated: 2021-06-28