Learning Individually Inferred Communication for Multi-Agent Cooperation, arXiv - CS - Multiagent Systems


arXiv - CS - Multiagent Systems. Pub Date: 2020-06-11, DOI: arxiv-2006.06455
Ziluo Ding, Tiejun Huang and Zongqing Lu

Communication lays the foundation for human cooperation. It is also crucial for multi-agent cooperation. However, existing work focuses on broadcast communication, which is not only impractical but also leads to information redundancy that could even impair the learning process. To tackle these difficulties, we propose Individually Inferred Communication (I2C), a simple yet effective model that enables agents to learn a prior for agent-agent communication. The prior knowledge is learned via causal inference and realized by a feed-forward neural network that maps the agent's local observation to a belief about whom to communicate with. The influence of one agent on another is inferred via the joint action-value function in multi-agent reinforcement learning and quantified to label the necessity of agent-agent communication. Furthermore, the agent policy is regularized to better exploit communicated messages. Empirically, we show that I2C not only reduces communication overhead but also improves performance in a variety of multi-agent cooperative scenarios, compared to existing methods.
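The abstract does not spell out how influence is quantified, so the following is only a minimal sketch of one plausible reading, not the authors' formulation. It assumes two agents with discrete actions and a tabular joint action-value `q_joint[a_i, a_j]` for a fixed observation. Influence is taken as the KL divergence between agent i's softmax action distribution conditioned on agent j's action and the same distribution averaged over agent j's actions. The function names, the uniform averaging, the temperature, and the threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, temperature=1.0):
    """Numerically stable softmax over a 1-D array of action values."""
    z = (x - np.max(x)) / temperature
    e = np.exp(z)
    return e / e.sum()


def influence(q_joint, a_j, temperature=1.0):
    """Illustrative influence of agent j's action a_j on agent i.

    q_joint has shape (n_actions_i, n_actions_j). The score is
    KL(p(a_i | a_j) || p(a_i)), where p(a_i) averages the conditional
    distributions uniformly over agent j's actions (an assumption).
    """
    n_actions_j = q_joint.shape[1]
    p_cond = softmax(q_joint[:, a_j], temperature)
    p_marg = np.mean(
        [softmax(q_joint[:, k], temperature) for k in range(n_actions_j)],
        axis=0,
    )
    return float(np.sum(p_cond * np.log(p_cond / p_marg)))


def label_communication(q_joint, a_j, threshold=0.1):
    """Label communication as necessary when influence exceeds a threshold.

    The threshold value is hypothetical; such labels could serve as
    training targets for a network that maps observations to a belief
    about whom to communicate with.
    """
    return influence(q_joint, a_j) > threshold


# Agent i's values ignore agent j's action: no influence expected.
q_independent = np.tile(np.array([[1.0], [2.0], [0.5]]), (1, 2))
# Agent i's best action flips with agent j's action: strong influence.
q_dependent = np.array([[5.0, 0.0], [0.0, 5.0]])

needs_comm_independent = label_communication(q_independent, a_j=0)
needs_comm_dependent = label_communication(q_dependent, a_j=0)
```

Under these assumptions, an agent whose action values do not depend on the other agent's action receives a label of "no communication needed", while an agent whose best action changes with the other agent's action is labeled as needing communication.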


Updated: 2020-06-15
