Robust optimal tracking control for multiplayer systems by off‐policy Q‐learning approach
International Journal of Robust and Nonlinear Control (IF 3.2), Pub Date: 2020-10-14, DOI: 10.1002/rnc.5263
Jinna Li, Zhenfei Xiao, Ping Li, Jiangtao Cao

In this article, a novel off-policy cooperative-game Q-learning algorithm is proposed for achieving optimal tracking control of linear discrete-time multiplayer systems subject to exogenous dynamic disturbance. The key strategy, for the first time, is to integrate reinforcement learning and cooperative games with output regulation under the discrete-time sampling framework, thereby achieving data-driven optimal tracking control and disturbance rejection. Without knowledge of the state and input matrices of the multiplayer systems, or of the dynamics of the exogenous disturbance and the command generator, the coordination equilibrium solution and the steady-state control laws are learned from data by a novel off-policy Q-learning approach, so that the multiplayer systems tolerate the disturbance and track the reference signal optimally. Moreover, rigorous theoretical proofs of the unbiasedness of the coordination equilibrium solution and the convergence of the proposed algorithm are presented. Simulation results demonstrate the efficacy of the developed approach.
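To make the learning mechanism the abstract alludes to concrete, below is a minimal sketch of off-policy Q-learning for a plain single-player discrete-time LQR problem. Everything in it is a hypothetical toy (the matrices A and B, the cost weights, the probing-noise level), and the paper's multiplayer cooperative game, output regulation, and disturbance rejection are deliberately omitted; the sketch only illustrates how a quadratic Q-function can be estimated by least squares from data generated under an exploratory behavior policy while a different target policy is evaluated and improved.

```python
# Minimal off-policy Q-learning sketch for single-player discrete-time LQR.
# All matrices and sizes below are hypothetical illustration values.
import numpy as np

np.random.seed(0)

# True dynamics x_{k+1} = A x_k + B u_k; used only to generate data,
# the learner never reads A or B directly.
A = np.array([[0.9, 0.2],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
n, m = 2, 1
Qc, Rc = np.eye(n), np.eye(m)        # stage cost x'Qx + u'Ru

def phi(z):
    """Quadratic basis so that z'Hz = phi(z)'h, where h stacks the
    upper-triangular entries of the symmetric kernel H."""
    outer = np.outer(z, z)
    iu = np.triu_indices(len(z))
    scale = np.where(iu[0] == iu[1], 1.0, 2.0)   # off-diagonals count twice
    return scale * outer[iu]

# One fixed batch of data under an exploratory behavior policy;
# reusing it while evaluating other gains is what makes this off-policy.
K = np.zeros((m, n))                 # initial gain (A itself is stable here)
X, U, Xn = [], [], []
x = np.random.randn(n)
for _ in range(200):
    u = K @ x + 0.5 * np.random.randn(m)         # probing noise
    xn = A @ x + B @ u
    X.append(x); U.append(u); Xn.append(xn)
    x = xn

# Policy iteration on the same batch: evaluate the Q-function of the
# current target gain K, then improve K greedily.
for _ in range(20):
    Phi = [phi(np.concatenate([x, u])) - phi(np.concatenate([xn, K @ xn]))
           for x, u, xn in zip(X, U, Xn)]
    y = [x @ Qc @ x + u @ Rc @ u for x, u in zip(X, U)]
    h, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)

    H = np.zeros((n + m, n + m))
    H[np.triu_indices(n + m)] = h
    H = H + H.T - np.diag(np.diag(H))            # rebuild symmetric H

    K_new = -np.linalg.solve(H[n:, n:], H[n:, :n])   # greedy improvement
    if np.linalg.norm(K_new - K) < 1e-9:
        K = K_new
        break
    K = K_new

print("learned gain K =", K)
```

Note that the evaluation equation uses the target action K x_{k+1} at the next state rather than the noisy behavior action, so the probing noise does not bias the estimated Q-function. This is a well-known advantage of off-policy designs and is consonant with (though far simpler than) the unbiasedness result for the coordination equilibrium solution that the abstract claims.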

Updated: 2020-12-04