Offline and Online Adaptive Critic Control Designs With Stability Guarantee Through Value Iteration
IEEE Transactions on Cybernetics (IF 9.4), Pub Date: 2021-09-13, DOI: 10.1109/tcyb.2021.3107801
Mingming Ha, Ding Wang, Derong Liu

This article studies the stability of closed-loop systems under the control policies generated by value iteration (VI). Stability properties, including admissibility criteria and the attraction domain, are investigated. An offline integrated VI scheme with a stability guarantee is developed by combining the advantages of VI and policy iteration, which makes it convenient to obtain admissible control policies. Based on the concept of the attraction domain, an online adaptive dynamic programming (ADP) algorithm that uses immature control policies is also developed, and the state trajectory under this online algorithm is guaranteed to converge to the origin. For linear systems in particular, the online ADP algorithm with a general scheme has a stronger stability property: the theoretical results show that the stability of the linear system can be guaranteed even if the control policy sequence includes finitely many unstable elements. Numerical results verify the effectiveness of the proposed algorithms.
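The abstract's point about linear systems can be illustrated with a small sketch. The code below is not the paper's algorithm. It is textbook value iteration for a scalar discrete-time linear-quadratic problem, with assumed dynamics x_{k+1} = a x_k + b u_k, assumed stage cost q x^2 + r u^2, and an assumed zero initial value function. The function names and the numerical values (a = 2, b = q = r = 1) are hypothetical and chosen only for illustration.

```python
import math


def value_iteration(a, b, q, r, iterations):
    """Scalar LQR value iteration starting from V_0(x) = 0 (assumed setup).

    Returns a list of (P_i, K_i, closed_loop_i), where V_i(x) = P_i * x**2,
    the greedy policy for V_i is u = -K_i * x, and closed_loop_i = a - b * K_i.
    """
    p = 0.0
    history = []
    for _ in range(iterations):
        # Greedy gain with respect to the current value function.
        k = a * b * p / (r + b * b * p)
        history.append((p, k, a - b * k))
        # Riccati-type value update.
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return history


def simulate_online(history, x0):
    """Apply the i-th iterative policy at time step i (an evolving policy)."""
    x = x0
    trajectory = [x]
    for _, _, closed_loop in history:
        x = closed_loop * x
        trajectory.append(x)
    return trajectory


history = value_iteration(a=2.0, b=1.0, q=1.0, r=1.0, iterations=30)
# A scalar policy is stabilizing when |a - b*K| < 1.
first_admissible = next(
    i for i, (_, _, c) in enumerate(history) if abs(c) < 1.0
)
trajectory = simulate_online(history, x0=1.0)
print("first stabilizing policy index:", first_admissible)
print("final state:", trajectory[-1])
```

With these assumed values, the policies from iterations 0 and 1 give closed-loop multipliers 2 and 1, so neither is stabilizing, while every later policy is. The simulated state still decays toward the origin, which is consistent with the abstract's claim that finitely many unstable elements in the policy sequence need not destroy stability for linear systems.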

Updated: 2021-09-13