Adaptive Multigradient Recursive Reinforcement Learning Event-Triggered Tracking Control for Multiagent Systems
IEEE Transactions on Neural Networks and Learning Systems (IF 10.4), Pub Date: 2021-07-01, DOI: 10.1109/tnnls.2021.3090570
Hongyi Li, Ying Wu, Mou Chen, Renquan Lu

This article proposes a fault-tolerant adaptive multigradient recursive reinforcement learning (RL) event-triggered tracking control scheme for strict-feedback discrete-time multiagent systems. The multigradient recursive RL algorithm is used to avoid the local-optimum problem that may arise in gradient-descent schemes. Unlike existing event-triggered control results, a new lemma on the relative-threshold event-triggered control strategy is proposed to handle the compensation error, which improves the utilization of communication resources and mitigates the negative impact on tracking accuracy and closed-loop system stability. To overcome the difficulty caused by sensor faults, a distributed control method is introduced by adopting the adaptive compensation technique, which effectively decreases the number of online estimation parameters. Furthermore, by using the multigradient recursive RL algorithm with fewer learning parameters, the online estimation time can be effectively reduced. The stability of the closed-loop multiagent systems is proved by using the Lyapunov stability theorem, and it is verified that all signals are semiglobally uniformly ultimately bounded. Finally, two simulation examples are given to demonstrate the effectiveness of the presented control scheme.
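The abstract does not spell out the relative-threshold event-triggered mechanism it references, so the following is only a minimal sketch of the general idea under assumed notation: a control update is transmitted to the actuator only when its deviation from the last transmitted value exceeds a fraction delta of the current control magnitude plus a small offset m. The function names, the scalar placeholder plant, and the parameters delta and m are illustrative assumptions, not the paper's exact formulation.

```python
def relative_threshold_trigger(u_current, u_last_sent, delta=0.2, m=0.05):
    """Return True when the control signal should be transmitted.

    Illustrative relative-threshold rule: an event fires when the deviation
    between the freshly computed control u_current and the last transmitted
    control u_last_sent exceeds delta * |u_current| + m.
    """
    return abs(u_current - u_last_sent) > delta * abs(u_current) + m


def run_event_triggered_loop(controller, x0, steps=100, delta=0.2, m=0.05):
    """Simulate one agent under the relative-threshold triggering rule.

    `controller(x)` maps the state to a control value; the plant here is a
    placeholder scalar system x_{k+1} = 0.9 * x_k + u_k used only to show
    how the trigger gates transmissions.
    """
    x = x0
    u_sent = controller(x0)          # the first control value is always transmitted
    transmissions = 0
    for _ in range(steps):
        u_new = controller(x)
        if relative_threshold_trigger(u_new, u_sent, delta, m):
            u_sent = u_new           # transmit only when the trigger fires
            transmissions += 1
        x = 0.9 * x + u_sent         # actuator holds the last transmitted control
    return x, transmissions


if __name__ == "__main__":
    final_state, n_events = run_event_triggered_loop(lambda x: -0.5 * x, x0=1.0)
    print(f"final state = {final_state:.4f}, transmissions = {n_events}")
```

Larger delta or m values reduce the number of transmissions at the cost of a larger compensation error, which is the trade-off the paper's new lemma is stated to address.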

Updated: 2021-07-01