Distributed Algorithms for Linearly-Solvable Optimal Control in Networked Multi-Agent Systems,arXiv - CS - Multiagent Systems

arXiv - CS - Multiagent Systems, Pub Date: 2021-02-18, DOI: arxiv-2102.09104
Neng Wan, Aditya Gahlawat, Naira Hovakimyan, Evangelos A. Theodorou, Petros G. Voulgaris

This paper investigates distributed algorithms for both discrete-time and continuous-time linearly solvable optimal control (LSOC) problems in networked multi-agent systems (MASs). A distributed framework is proposed that partitions the optimal control problem of a networked MAS into several local optimal control problems on factorial subsystems. Each (central) agent acts optimally to minimize the joint cost function of a subsystem comprising that agent and its neighboring agents, and the local control actions (policies) rely only on local observations. This framework not only preserves the correlations between neighboring agents but also moderates the communication and computational complexity by decentralizing the sampling and computation over the network. For discrete-time systems modeled by Markov decision processes, the joint Bellman equation of each subsystem is transformed into a system of linear equations and solved using parallel programming. For continuous-time systems modeled by Itô diffusion processes, the joint optimality equation of each subsystem is converted into a linear partial differential equation, whose solution is approximated by two methods: a path integral formulation and a sample-efficient relative entropy policy search algorithm. The learned control policies are generalized to unlearned tasks by means of the compositionality principle, and illustrative examples with cooperative UAV teams are provided to verify the effectiveness and advantages of these algorithms.
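
To illustrate the discrete-time case, the following is a minimal single-agent sketch of how a Bellman equation becomes linear in the LSOC setting. It is not the paper's distributed algorithm. It assumes the standard first-exit formulation of linearly solvable MDPs, in which the desirability function z(x) = exp(-v(x)) satisfies z(x) = exp(-q(x)) * sum_y p(y|x) z(y) on non-terminal states and the optimal transition law is u*(y|x) proportional to p(y|x) z(y). The function names `solve_desirability` and `optimal_policy`, the five-state chain, and the state cost 0.1 are all hypothetical choices made for this example, and the linear system is solved here by a plain Gauss-Seidel iteration rather than by parallel programming.

```python
import math

def solve_desirability(P, q, terminal, tol=1e-12, max_iter=100_000):
    """Solve the linear equation z = exp(-q) * (P z) on non-terminal states.

    P        : passive transition matrix, P[x][y] = p(y | x)
    q        : per-state cost on non-terminal states
    terminal : dict {state: terminal cost}; z is fixed to exp(-cost) there
    """
    n = len(P)
    z = [1.0] * n
    for s, phi in terminal.items():
        z[s] = math.exp(-phi)
    for _ in range(max_iter):
        delta = 0.0
        for x in range(n):
            if x in terminal:
                continue  # boundary values stay fixed
            new = math.exp(-q[x]) * sum(P[x][y] * z[y] for y in range(n))
            delta = max(delta, abs(new - z[x]))
            z[x] = new  # Gauss-Seidel: reuse updated values immediately
        if delta < tol:
            break
    return z

def optimal_policy(P, z):
    """Reweight the passive dynamics by desirability: u*(y|x) ~ p(y|x) z(y)."""
    policy = []
    for row in P:
        weights = [p * zy for p, zy in zip(row, z)]
        total = sum(weights)
        policy.append([w / total for w in weights])
    return policy

# Hypothetical example: a 5-state chain with an unbiased random walk as the
# passive dynamics; state 4 is an absorbing goal with zero terminal cost.
P = [
    [0.5, 0.5, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]
q = [0.1, 0.1, 0.1, 0.1, 0.0]
z = solve_desirability(P, q, terminal={4: 0.0})
policy = optimal_policy(P, z)
value = [-math.log(zi) for zi in z]  # cost-to-go recovered from z
```

In this sketch the desirability grows toward the goal state, so the reweighted policy shifts probability toward moves that approach it. The point of the example is only that no minimization over actions appears in the solve: the unknown z enters linearly.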

Updated: 2021-02-19
