Automated Discovery of Local Rules for Desired Collective-Level Behavior Through Reinforcement Learning
Frontiers in Physics (IF 3.1), Pub Date: 2020-05-05, DOI: 10.3389/fphy.2020.00200
Tiago Costa, Andres Laan, Francisco J. H. Heras, Gonzalo G. de Polavieja

Complex global behavior patterns can emerge from very simple local interactions between many agents. However, no local interaction rules have been identified that generate some patterns observed in nature, for example the rotating balls, rotating tornadoes and full-core rotating mills observed in fish collectives. Here we show that locally interacting agents modeled with a minimal cognitive system can produce these collective patterns. We obtained this result by using recent advances in reinforcement learning to systematically solve the inverse modeling problem: given an observed collective behavior, we automatically find a policy that generates it. Each agent uses a neural network to process information from neighboring agents and choose actions, and moves in an environment of simulated physics. Although every agent is equipped with its own neural network, all agents share the same network architecture and parameter values, which ensures that a single policy is responsible for the emergence of a given pattern. We find the final policies by tuning the neural network weights until the collective behavior produced approaches the desired one. By using modular neural networks whose modules have a small number of inputs and outputs, we built an interpretable model of collective motion. This enabled us to analyse the policies obtained. We found a similar general structure for the four different collective patterns, not dissimilar to the one we have previously inferred from experimental zebrafish trajectories; but we also found consistent differences between the policies generating the different collective patterns, for example repulsion in the vertical direction for the more three-dimensional structures of the sphere and tornado. Our results illustrate how new advances in artificial intelligence, and specifically in reinforcement learning, allow new approaches to the analysis and modeling of collective behavior.
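
The shared-policy idea in the abstract (every agent runs the same network weights, and the weights are tuned until a collective-level measure approaches a target) can be illustrated with a toy sketch. This is not the authors' implementation: the 2D damped dynamics, the nearest-neighbor observation, the target-radius objective, and the plain hill-climbing search (standing in for the reinforcement-learning algorithm, which the abstract does not name) are all assumptions made for illustration, as are the function names `policy`, `simulate`, `score` and `tune`.

```python
import numpy as np

def policy(weights, obs):
    """Shared policy: one small network applied to every agent's observation."""
    w1 = weights[:8].reshape(2, 4)   # 2 inputs -> 4 hidden units
    w2 = weights[8:].reshape(4, 2)   # 4 hidden units -> 2D acceleration
    return np.tanh(obs @ w1) @ w2

def simulate(weights, n_agents=8, steps=50, dt=0.1, seed=0):
    """Roll out all agents with the same weights; return final positions."""
    rng = np.random.default_rng(seed)
    pos = rng.normal(size=(n_agents, 2))
    vel = np.zeros((n_agents, 2))
    for _ in range(steps):
        # Local information only: offset from each agent to its nearest neighbor.
        diff = pos[None, :, :] - pos[:, None, :]   # diff[i, j] = pos[j] - pos[i]
        dist = np.linalg.norm(diff, axis=-1)
        np.fill_diagonal(dist, np.inf)             # an agent is not its own neighbor
        nearest = dist.argmin(axis=1)
        obs = diff[np.arange(n_agents), nearest]
        vel = 0.9 * vel + dt * policy(weights, obs)  # damped toy dynamics (assumption)
        pos = pos + dt * vel
    return pos

def score(weights, target_radius=1.0):
    """Collective-level objective (assumed): mean distance to the centroid
    should be close to target_radius. Higher is better."""
    pos = simulate(weights)
    radius = np.linalg.norm(pos - pos.mean(axis=0), axis=1).mean()
    return -abs(radius - target_radius)

def tune(n_iters=200, sigma=0.1, seed=1):
    """Hill climbing: keep a random perturbation only if the score improves."""
    rng = np.random.default_rng(seed)
    weights = np.zeros(16)
    initial = best = score(weights)
    for _ in range(n_iters):
        candidate = weights + sigma * rng.normal(size=weights.shape)
        s = score(candidate)
        if s > best:
            weights, best = candidate, s
    return weights, initial, best

if __name__ == "__main__":
    weights, initial, best = tune()
    print(f"score before tuning: {initial:.3f}, after: {best:.3f}")
```

Because `simulate` passes one weight vector to every agent, any pattern that emerges is attributable to a single policy, which is the property the abstract emphasizes. The search only ever scores the group as a whole, never an individual agent.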




Updated: 2020-06-25