A composite learning method for multi-ship collision avoidance based on reinforcement learning and inverse control
Neurocomputing (IF 5.5), Pub Date: 2020-10-01, DOI: 10.1016/j.neucom.2020.05.089
Shuo Xie, Xiumin Chu, Mao Zheng, Chenguang Liu

Abstract: Model-free reinforcement learning methods have potential for ship collision avoidance in unknown environments. To address the low efficiency of model-free reinforcement learning, a composite learning method is proposed based on the asynchronous advantage actor-critic (A3C) algorithm, a long short-term memory (LSTM) neural network, and Q-learning. The proposed method uses Q-learning to make adaptive decisions between an LSTM inverse-model-based controller and the model-free A3C policy. Multi-ship collision avoidance simulations are conducted to verify the effectiveness of the model-free A3C method, the proposed inverse-model-based method, and the composite learning method. The simulation results indicate that the proposed composite-learning-based ship collision avoidance method outperforms the A3C learning method and a traditional optimization-based method.
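The core idea of the composite method is a higher-level Q-learning agent that decides, per encounter state, whether to delegate control to the LSTM inverse-model controller or to the model-free A3C policy. A minimal tabular sketch of such a switching scheme is shown below; the discretized toy environment, reward shape, and all names are illustrative assumptions, not the paper's actual implementation.

```python
import random

ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1
N_STATES, N_ACTIONS = 10, 2  # action 0: inverse-model controller, action 1: A3C policy
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def choose_controller(state):
    """Epsilon-greedy selection between the two lower-level controllers."""
    if random.random() < EPS:
        return random.randrange(N_ACTIONS)
    row = Q[state]
    return row.index(max(row))

def update(state, action, reward, next_state):
    """Standard Q-learning update on the switching policy."""
    best_next = max(Q[next_state])
    Q[state][action] += ALPHA * (reward + GAMMA * best_next - Q[state][action])

# Toy rollout (hypothetical reward model): assume the inverse-model
# controller performs better in "close-range" states (state < 5) and
# the A3C policy performs better elsewhere.
random.seed(0)
for _ in range(5000):
    s = random.randrange(N_STATES)
    a = choose_controller(s)
    preferred = 0 if s < 5 else 1
    r = 1.0 if a == preferred else 0.0
    update(s, a, r, random.randrange(N_STATES))
```

After training, the learned Q-table favors the inverse-model controller in the low-index states and the A3C policy in the high-index ones, mirroring the adaptive-decision role Q-learning plays in the proposed method.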
