Adaptive Optimal Tracking Control of an Underactuated Surface Vessel Using Actor-Critic Reinforcement Learning.
IEEE Transactions on Neural Networks and Learning Systems ( IF 10.4 ) Pub Date : 2022-11-30 , DOI: 10.1109/tnnls.2022.3214681
Lin Chen 1 , Shi-Lu Dai 2 , Chao Dong 3

In this article, we present an adaptive reinforcement learning optimal tracking control (RLOTC) algorithm for an underactuated surface vessel subject to modeling uncertainties and time-varying external disturbances. By integrating the backstepping technique with the optimized control design, we show that the desired optimal tracking performance of vessel control is guaranteed, because the virtual and actual control inputs are designed as optimized solutions of each subsystem. To enhance the robustness of the vessel control system, we employ neural network (NN) approximators to approximate the uncertain vessel dynamics and present an adaptive control technique to estimate the upper bound of the external disturbances. Under the reinforcement learning framework, we construct actor-critic networks to solve the Hamilton-Jacobi-Bellman equations corresponding to the subsystems of the surface vessel and thereby achieve the optimized control. The optimized control algorithm synchronously trains the adaptive parameters not only of the actor-critic networks but also of the NN approximators and the adaptive control. Using the Lyapunov stability theorem, we show that the RLOTC algorithm ensures the semiglobal uniform ultimate boundedness of the closed-loop system. Compared with existing reinforcement learning control results, the presented RLOTC algorithm can compensate for uncertain vessel dynamics and unknown disturbances, and it obtains the optimized control performance by considering optimization in every backstepping design step. Simulation studies on an underactuated surface vessel are given to illustrate the effectiveness of the RLOTC algorithm.
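
As a rough illustration of the kind of equation the actor-critic networks are described as solving, the block below writes out the textbook Hamilton-Jacobi-Bellman setup for a generic control-affine subsystem. It is a sketch under stated assumptions, not the article's formulation: the symbols z, f, g, u, Q, R, V, the basis vector \phi, and the weight estimates \hat{W}_c and \hat{W}_a are all chosen here for illustration, and the abstract does not state the actual cost function or network structure.

```latex
% Generic control-affine subsystem (illustrative; not taken from the article)
\dot{z} = f(z) + g(z)\,u

% Infinite-horizon quadratic cost with Q \succeq 0 and R \succ 0 (assumed form)
V(z(t)) = \int_{t}^{\infty} \bigl( z^{\top} Q z + u^{\top} R u \bigr)\, d\tau

% HJB equation satisfied by the optimal value function V^{*}
0 = \min_{u} \Bigl[ z^{\top} Q z + u^{\top} R u
      + \nabla V^{*}(z)^{\top} \bigl( f(z) + g(z)\,u \bigr) \Bigr]

% Setting the derivative with respect to u to zero gives the optimal control
u^{*}(z) = -\tfrac{1}{2}\, R^{-1} g(z)^{\top} \nabla V^{*}(z)

% Actor-critic approximation (assumed linear-in-weights form):
% the critic estimates V^{*}, the actor estimates u^{*}
\hat{V}(z) = \hat{W}_c^{\top} \phi(z), \qquad
\hat{u}(z) = -\tfrac{1}{2}\, R^{-1} g(z)^{\top} \nabla \phi(z)^{\top} \hat{W}_a
```

In a backstepping design of the kind the abstract describes, one such problem would be posed per subsystem, with the virtual control playing the role of u in the earlier steps and the actual control input in the last.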

Updated: 2022-11-30