Reinforcement learning in a continuum of agents
Swarm Intelligence ( IF 2.6 ) Pub Date : 2017-10-13 , DOI: 10.1007/s11721-017-0142-9
Adrian Šošić , Abdelhak M. Zoubir , Heinz Koeppl

We present a decision-making framework for modeling the collective behavior of large groups of cooperatively interacting agents based on a continuum description of the agents’ joint state. The continuum model is derived from an agent-based system of locally coupled stochastic differential equations, taking into account that each agent in the group is only partially informed about the global system state. The usefulness of the proposed framework is twofold: (i) for multi-agent scenarios, it provides a computational approach to handling large-scale distributed decision-making problems and learning decentralized control policies; (ii) for single-agent systems, it offers an alternative approximation scheme for evaluating expectations of state distributions. We demonstrate our framework on a variant of the Kuramoto model using a variety of distributed control tasks, such as positioning and aggregation. As part of our experiments, we compare the effectiveness of the controllers learned by the continuum model and agent-based systems of different sizes, and we analyze how the degree of observability in the system affects the learning process.
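To make the experimental setting concrete, the following is a minimal sketch of an agent-based system of locally coupled, noisy Kuramoto oscillators, simulated with an Euler–Maruyama scheme. This is an illustrative toy model only: the ring topology, neighbor count, noise level, and all parameter values are assumptions for demonstration and are not taken from the paper, which studies a controlled variant of the model.

```python
import numpy as np

def simulate_kuramoto(n=100, k=2.0, sigma=0.1, dt=0.01, steps=2000,
                      n_neighbors=5, seed=0):
    """Euler-Maruyama simulation of a locally coupled, noisy Kuramoto model.

    Each oscillator couples only to its n_neighbors nearest indices on a
    ring, mimicking agents that are only partially informed about the
    global system state.
    """
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, n)   # initial phases
    omega = rng.normal(0.0, 0.5, n)            # natural frequencies
    for _ in range(steps):
        coupling = np.zeros(n)
        for d in range(1, n_neighbors + 1):    # ring-local interactions
            coupling += np.sin(np.roll(theta, d) - theta)
            coupling += np.sin(np.roll(theta, -d) - theta)
        drift = omega + (k / (2 * n_neighbors)) * coupling
        # SDE step: deterministic drift plus Gaussian diffusion
        theta = theta + drift * dt + sigma * np.sqrt(dt) * rng.normal(size=n)
    # Kuramoto order parameter r in [0, 1]; r = 1 means full phase synchrony
    r = float(np.abs(np.mean(np.exp(1j * theta))))
    return theta, r

theta, r = simulate_kuramoto()
print(f"order parameter r = {r:.3f}")
```

In the continuum limit (n → ∞), the empirical phase distribution of such a population is replaced by a density over phases, which is the kind of joint-state description the framework builds its policies on.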
