A composite learning method for multi-ship collision avoidance based on reinforcement learning and inverse control
Neurocomputing (IF 5.5), Pub Date: 2020-10-01, DOI: 10.1016/j.neucom.2020.05.089
Shuo Xie, Xiumin Chu, Mao Zheng, Chenguang Liu

Abstract: Model-free reinforcement learning methods have potential for ship collision avoidance in unknown environments. To address the low efficiency of model-free reinforcement learning, a composite learning method is proposed based on the asynchronous advantage actor-critic (A3C) algorithm, a long short-term memory (LSTM) neural network, and Q-learning. The proposed method uses Q-learning to make adaptive decisions between an LSTM inverse-model-based controller and the model-free A3C policy. Multi-ship collision avoidance simulations are conducted to verify the effectiveness of the model-free A3C method, the proposed inverse-model-based method, and the composite learning method. The simulation results indicate that the proposed composite-learning-based ship collision avoidance method outperforms the A3C learning method and a traditional optimization-based method.
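The core idea of the composite method is a higher-level Q-learning agent that decides, per encounter state, whether to delegate control to the LSTM inverse-model controller or to the model-free A3C policy. A minimal tabular sketch of such a switching scheme is shown below; the discretized toy environment, reward shape, and all names are illustrative assumptions, not the paper's actual implementation.

```python
import random

ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1
N_STATES, N_ACTIONS = 10, 2  # action 0: inverse-model controller, action 1: A3C policy
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def choose_controller(state):
    """Epsilon-greedy selection between the two lower-level controllers."""
    if random.random() < EPS:
        return random.randrange(N_ACTIONS)
    row = Q[state]
    return row.index(max(row))

def update(state, action, reward, next_state):
    """Standard Q-learning update on the switching policy."""
    best_next = max(Q[next_state])
    Q[state][action] += ALPHA * (reward + GAMMA * best_next - Q[state][action])

# Toy rollout (hypothetical reward model): assume the inverse-model
# controller performs better in "close-range" states (state < 5) and
# the A3C policy performs better elsewhere.
random.seed(0)
for _ in range(5000):
    s = random.randrange(N_STATES)
    a = choose_controller(s)
    preferred = 0 if s < 5 else 1
    r = 1.0 if a == preferred else 0.0
    update(s, a, r, random.randrange(N_STATES))
```

After training, the learned Q-table favors the inverse-model controller in the low-index states and the A3C policy in the high-index ones, mirroring the adaptive-decision role Q-learning plays in the proposed method.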
