Learning Individually Inferred Communication for Multi-Agent Cooperation
arXiv - CS - Multiagent Systems
Pub Date: 2020-06-11, DOI: arxiv-2006.06455
Ziluo Ding, Tiejun Huang, and Zongqing Lu
Communication lays the foundation for human cooperation. It is also crucial
for multi-agent cooperation. However, existing work focuses on broadcast
communication, which is not only impractical but also leads to information
redundancy that could even impair the learning process. To tackle these
difficulties, we propose Individually Inferred Communication (I2C), a
simple yet effective model to enable agents to learn a prior for agent-agent
communication. The prior knowledge is learned via causal inference and realized
by a feed-forward neural network that maps the agent's local observation to a
belief about who to communicate with. The influence of one agent on another is
inferred via the joint action-value function in multi-agent reinforcement
learning and quantified to label the necessity of agent-agent communication.
Furthermore, the agent policy is regularized to better exploit communicated
messages. Empirically, we show that I2C can not only reduce communication
overhead but also improve the performance in a variety of multi-agent
cooperative scenarios compared with existing methods.
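
The abstract says that one agent's influence on another is inferred from the joint action-value function and quantified to label whether communication is needed, but it does not state the exact measure. The sketch below is one possible reading, not the authors' implementation: it assumes a small tabular slice of joint Q-values, turns them into a distribution with a softmax, and uses the KL divergence between agent i's action distribution with and without knowledge of agent j's action as the influence score. The names `softmax_joint`, `influence`, `label_communication`, and the `temperature` and `threshold` parameters are all hypothetical.

```python
import math


def softmax_joint(q_table, temperature=1.0):
    """Softmax over every entry of q_table[a_i][a_j] (assumed joint Q-values)."""
    q_max = max(max(row) for row in q_table)  # subtract the max for numerical stability
    exps = [[math.exp((q - q_max) / temperature) for q in row] for row in q_table]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]


def influence(q_table, a_j, temperature=1.0):
    """KL( P(a_i | a_j) || P(a_i) ): how much knowing a_j shifts agent i's action distribution.

    This measure is an assumption made for illustration; the abstract only says
    the influence is "quantified".
    """
    joint = softmax_joint(q_table, temperature)
    column = [row[a_j] for row in joint]
    column_sum = sum(column)
    p_cond = [c / column_sum for c in column]   # condition on agent j's action
    p_marg = [sum(row) for row in joint]        # marginalize agent j's action out
    return sum(pc * math.log(pc / pm) for pc, pm in zip(p_cond, p_marg) if pc > 0.0)


def label_communication(q_table, a_j, threshold=0.1, temperature=1.0):
    """Label 1 (communicate) if the influence exceeds a hypothetical threshold, else 0."""
    return int(influence(q_table, a_j, temperature) > threshold)


# Agent j's action changes which action is best for agent i: strong influence.
dependent = [[5.0, 0.0], [0.0, 5.0]]
# Columns are identical, so agent j's action is irrelevant to agent i: no influence.
independent = [[1.0, 1.0], [3.0, 3.0]]

print(label_communication(dependent, a_j=0))
print(label_communication(independent, a_j=0))
```

Labels produced this way could serve as supervised targets for the feed-forward network the abstract describes, which maps an agent's local observation to a belief about whom to communicate with.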
Updated: 2020-06-15