Autonomous boat driving system using sample‐efficient model predictive control‐based reinforcement learning approach
Journal of Field Robotics (IF 8.3), Pub Date: 2020-09-25, DOI: 10.1002/rob.21990
Yunduan Cui (1, 2, 3), Shigeki Osaki (4), Takamitsu Matsubara (1)

In this article, we propose a novel reinforcement learning (RL) approach specialized for autonomous boats: sample‐efficient probabilistic model predictive control (SPMPC), which iteratively learns control policies for boats in real ocean environments without human prior knowledge. SPMPC addresses the main difficulties of this challenging application: large uncertainties, the need for rapid adaptation to dynamic environmental conditions, and the extremely high cost of exploring and sampling with a real vessel. SPMPC combines a Gaussian process model and model predictive control under a model‐based RL framework to iteratively model and quickly respond to uncertain ocean environments while maintaining sample efficiency. An SPMPC system is developed with features including a quadrant‐based action search rule, bias compensation, and parallel computing, which contribute to better control capabilities. It successfully learns to control a full‐sized single‐engine boat, equipped with sensors measuring GPS position, speed, direction, and wind, in a real‐world position‐holding task without models from human demonstration.
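
The loop described in the abstract (fit a Gaussian process dynamics model, plan with model predictive control, apply one action, add the new sample, refit) can be illustrated with a toy sketch. This is not the authors' SPMPC implementation: the 1-D "boat" in `true_step`, the fixed kernel hyperparameters, and the random-shooting planner are all assumptions made for illustration. The planner uses only the GP mean, whereas a probabilistic MPC would also propagate predictive uncertainty. The quadrant-based action search, bias compensation, and parallel computing mentioned above are not modeled.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    # Squared-exponential kernel between the rows of A and the rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return signal_var * np.exp(-0.5 * d2 / length_scale**2)

class GPDynamics:
    """GP regression from (state, action) to state change; mean prediction only."""

    def __init__(self, noise_var=1e-4):
        self.noise_var = noise_var  # assumed observation noise, also acts as jitter

    def fit(self, X, y):
        self.X = X
        K = rbf_kernel(X, X) + self.noise_var * np.eye(len(X))
        self.alpha = np.linalg.solve(K, y)
        return self

    def predict_mean(self, Xq):
        return rbf_kernel(Xq, self.X) @ self.alpha

def true_step(x, u):
    # Hypothetical 1-D plant: thrust u moves the boat, a constant drift pushes it away.
    return x + 0.5 * u + 0.1

def mpc_action(model, x0, horizon=5, n_candidates=500, rng=None):
    # Random-shooting MPC: sample action sequences, roll them out through the
    # learned model, and return the first action of the cheapest sequence.
    if rng is None:
        rng = np.random.default_rng(0)
    U = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon))
    x = np.full(n_candidates, x0, dtype=float)
    cost = np.zeros(n_candidates)
    for t in range(horizon):
        inputs = np.stack([x, U[:, t]], axis=1)
        x = x + model.predict_mean(inputs)
        cost += x**2  # position-holding cost: squared distance from the target at 0
    return float(U[np.argmin(cost), 0])

def run(n_init=30, n_steps=20, seed=0):
    rng = np.random.default_rng(seed)
    # Initial data set gathered with random actions.
    xs = rng.uniform(-2.0, 2.0, n_init)
    us = rng.uniform(-1.0, 1.0, n_init)
    X = np.stack([xs, us], axis=1)
    y = true_step(xs, us) - xs
    x = 1.5  # start away from the target position
    for _ in range(n_steps):
        model = GPDynamics().fit(X, y)       # refit the model on all data so far
        u = mpc_action(model, x, rng=rng)    # plan with the learned model
        x_next = true_step(x, u)             # apply only the first action
        X = np.vstack([X, [x, u]])           # add the observed transition
        y = np.append(y, x_next - x)
        x = x_next
    return float(x)

if __name__ == "__main__":
    print("final position error:", run())
```

Refitting the model at every step keeps the sketch close to the iterative structure described above. A practical system would more likely update the model between trials and learn the kernel hyperparameters from data.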

Updated: 2020-09-25