Off-policy learning for adaptive optimal output synchronization of heterogeneous multi-agent systems
Automatica (IF 6.4) Pub Date: 2020-06-24, DOI: 10.1016/j.automatica.2020.109081
Ci Chen, Frank L. Lewis, Kan Xie, Shengli Xie, Yilu Liu

This paper proposes an off-policy learning-based dynamic state feedback protocol that achieves optimal synchronization of heterogeneous multi-agent systems (MAS) over a directed communication network. Note that most recent works on heterogeneous MAS do not design the synchronization protocol in an optimal manner. By formulating the cooperative output regulation problem as an H∞ optimization problem, we can use reinforcement learning to find output synchronization protocols online, along the system trajectories, without solving the output regulator equations. In contrast to the existing optimal-control literature, where the leader's state is assumed to be globally or distributively available over the communication network, we allow only relative system outputs to be transmitted through the network; that is, the leader's state is needed neither for control nor for learning.
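To make the off-policy idea concrete, here is a minimal sketch. It is not the paper's algorithm (which handles H∞ output synchronization of heterogeneous agents over a directed graph); it shows off-policy Q-learning for a single discrete-time LQR problem, where an exploratory behavior policy generates trajectory data and the optimal feedback gain is learned without using the system matrices in the learning equations. All matrices (A, B, Q_cost, R_cost) and hyperparameters below are illustrative assumptions.

```python
import numpy as np

# Minimal sketch, NOT the paper's H-infinity multi-agent algorithm:
# off-policy Q-learning (least-squares policy iteration) for a single
# discrete-time LQR problem. A behavior policy with exploration noise
# generates the data; the target policy u = -K x is evaluated and improved
# from that data alone, without using (A, B) in the learning equations.

np.random.seed(0)
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])      # true dynamics: used only to simulate data
B = np.array([[0.0],
              [0.1]])
Q_cost = np.eye(2)              # state cost weight (assumed)
R_cost = np.array([[1.0]])      # input cost weight (assumed)
n, m = 2, 1

def phi(x, u):
    """Quadratic basis: upper-triangular monomials of z = [x; u]."""
    z = np.concatenate([x, u])
    return np.array([z[i] * z[j] for i in range(n + m) for j in range(i, n + m)])

K = np.zeros((m, n))            # initial stabilizing gain (A itself is stable)
for it in range(8):             # policy iteration loop
    # --- policy evaluation from off-policy trajectory data ---
    Phi, c = [], []
    x = np.random.randn(n)
    for k in range(300):
        u = -K @ x + 0.3 * np.random.randn(m)   # exploratory behavior input
        x_next = A @ x + B @ u                  # simulate one step (data only)
        u_next = -K @ x_next                    # target-policy action
        # Bellman equation for Q^K:  w'phi(x,u) = cost + w'phi(x', -K x')
        Phi.append(phi(x, u) - phi(x_next, u_next))
        c.append(x @ Q_cost @ x + u @ R_cost @ u)
        x = x_next
    w = np.linalg.lstsq(np.array(Phi), np.array(c), rcond=None)[0]

    # --- unpack w into the symmetric Q-function matrix H ---
    H = np.zeros((n + m, n + m))
    idx = 0
    for i in range(n + m):
        for j in range(i, n + m):
            H[i, j] = H[j, i] = w[idx] if i == j else w[idx] / 2.0
            idx += 1

    # --- policy improvement: u = argmin_u [x;u]' H [x;u] ---
    K = np.linalg.solve(H[n:, n:], H[n:, :n])

# model-based optimal gain via Riccati iteration, for comparison only
P = Q_cost.copy()
for _ in range(500):
    G = np.linalg.solve(R_cost + B.T @ P @ B, B.T @ P @ A)
    P = Q_cost + A.T @ P @ (A - B @ G)
K_star = np.linalg.solve(R_cost + B.T @ P @ B, B.T @ P @ A)

print("learned K :", K)
print("optimal K*:", K_star)
```

The learned gain converges to the model-based Riccati gain even though the regression step never touches A or B. The paper applies the same off-policy principle to the H∞ formulation of cooperative output regulation, so each agent learns from measured trajectories using only relative output information exchanged over the directed network.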


