Automatica (IF 4.8). Pub Date: 2022-11-29. DOI: 10.1016/j.automatica.2022.110761. Jianguo Zhao, Chunyu Yang, Weinan Gao, Hamidreza Modares, Xinkai Chen, Wei Dai.
This paper considers the problem of linear quadratic tracking control (LQTC) with a discounted cost function for unknown systems. Existing design methods often require the discount factor to be small enough to guarantee closed-loop stability. However, solving the discounted algebraic Riccati equation (ARE) may suffer from numerical ill-conditioning if the discount factor is too small. Using singular perturbation theory, we decompose the full-order discounted ARE into a reduced-order ARE and a Sylvester equation, which facilitates designing the feedback and feedforward control gains. The obtained controller is proved to be a stabilizing and near-optimal solution to the original LQTC problem. In the framework of reinforcement learning, both on-policy and off-policy two-phase learning algorithms are derived to design the near-optimal tracking control policy without knowing the discount factor. Comparative simulation results illustrate the advantages of the developed approach.
Title: Linear quadratic tracking control of unknown systems: A two-phase reinforcement learning method
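The decomposition described in the abstract (full-order discounted ARE split into a reduced-order ARE for the feedback gain plus a Sylvester equation for the feedforward gain) can be sketched numerically. The snippet below is a minimal illustration under a standard discounted LQT formulation with known model matrices; all plant, reference, and weight matrices are hypothetical, and it does not reproduce the paper's singular-perturbation analysis or the model-free two-phase learning algorithms — it only shows the ARE-plus-Sylvester structure of the resulting controller.

```python
# Sketch: discounted LQT gains via a reduced-order ARE + Sylvester equation.
# Assumed setup (not from the paper): plant x' = Ax + Bu, y = Cx,
# reference r' = Fr, cost integral of exp(-gamma*t) [(y-r)' Qe (y-r) + u' R u].
import numpy as np
from scipy.linalg import solve_continuous_are, solve_sylvester

A = np.array([[0.0, 1.0], [0.0, -1.0]])   # illustrative plant dynamics
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
F = np.array([[0.0]])                     # constant reference generator
Qe = np.array([[1.0]])                    # tracking-error weight
R = np.array([[1.0]])                     # control weight
gamma = 0.1                               # discount factor

n = A.shape[0]
A_shift = A - 0.5 * gamma * np.eye(n)     # discount absorbed as a state-matrix shift

# Reduced-order ARE for the feedback part:
#   A_shift' P11 + P11 A_shift - P11 B R^{-1} B' P11 + C' Qe C = 0
P11 = solve_continuous_are(A_shift, B, C.T @ Qe @ C, R)
K = np.linalg.solve(R, B.T @ P11)         # feedback gain

# Sylvester equation for the cross term P12 (feedforward part):
#   (A_shift - B K)' P12 + P12 (F - (gamma/2) I) = C' Qe
A_cl = A_shift - B @ K
P12 = solve_sylvester(A_cl.T, F - 0.5 * gamma * np.eye(F.shape[0]), C.T @ Qe)
Kr = np.linalg.solve(R, B.T @ P12)        # feedforward gain

# Tracking controller: u = -K x - Kr r   (from u = -R^{-1} B' (P11 x + P12 r))
print("K =", K)
print("Kr =", Kr)
```

Note that only a plant-sized ARE and a linear Sylvester equation are solved here, rather than the full augmented ARE of dimension (plant + reference), which is the computational point of the decomposition.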