Applied Ocean Research (IF 4.3). Pub Date: 2021-06-27. DOI: 10.1016/j.apor.2021.102759. Lingyu Li, Defeng Wu, Youqiang Huang, Zhi-Ming Yuan
Improving the autopilot capability of ships is particularly important for ensuring the safety of maritime navigation. The unmanned surface vessel (USV) with autopilot capability is a development trend for the ship of the future. The objective of this paper is to investigate the path planning problem of USVs in uncertain environments, and a path planning strategy unified with a collision avoidance function based on deep reinforcement learning (DRL) is proposed. A deep Q-learning network (DQN) continuously interacts with a visually simulated environment to collect experience data, so that the agent learns the best action strategies in that environment. To handle the collision avoidance situations that may arise during USV navigation, the location of the obstacle ship is divided into four collision avoidance zones according to the International Regulations for Preventing Collisions at Sea (COLREGS). To obtain an improved DRL algorithm, the artificial potential field (APF) method is used to improve the action space and reward function of the DQN algorithm. Simulation experiments are used to test the effectiveness of the method in various situations, and the results show that the enhanced DRL can effectively realize autonomous collision avoidance path planning.
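The abstract does not give the exact zone boundaries or potential-field gains, so the sketch below is only illustrative: it assumes the relative-bearing sectors commonly used in COLREGS-aware planners (roughly ±5° for head-on and ±112.5° beam limits) and the standard quadratic attractive / inverse-distance repulsive potentials for the APF-shaped reward. The function names, sector angles, and gains `k_att`, `k_rep`, `d0` are all assumptions, not taken from the paper.

```python
import math

def colregs_zone(rel_bearing_deg):
    """Classify an obstacle ship into one of four collision-avoidance
    zones by its relative bearing in degrees (clockwise from own ship's
    heading). Sector boundaries here are conventional, not the paper's."""
    # Normalize bearing to the interval (-180, 180].
    b = (rel_bearing_deg + 180.0) % 360.0 - 180.0
    if -5.0 <= b <= 5.0:
        return "head-on"             # Rule 14: ships meeting nearly bow-on
    if 5.0 < b <= 112.5:
        return "crossing-give-way"   # Rule 15: obstacle on own starboard side
    if -112.5 <= b < -5.0:
        return "crossing-stand-on"   # obstacle on own port side
    return "overtaking"              # Rule 13: approaching from abaft the beam

def apf_reward(pos, goal, obstacles, k_att=1.0, k_rep=100.0, d0=10.0):
    """APF-shaped reward: the negative of an attractive potential toward
    the goal plus repulsive potentials that activate within distance d0
    of each obstacle. Gains are illustrative choices."""
    u_att = 0.5 * k_att * math.dist(pos, goal) ** 2
    u_rep = 0.0
    for obs in obstacles:
        d = math.dist(pos, obs)
        if 0.0 < d < d0:
            u_rep += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return -(u_att + u_rep)
```

In a DQN training loop of the kind the abstract describes, `colregs_zone` would feed the zone label into the state or action masking, while `apf_reward` would replace a sparse goal-reached reward with a dense, gradient-like signal.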
Title: A path planning strategy unified with COLREGS collision avoidance function based on deep reinforcement learning and artificial potential field