Implicit Communication as Minimum Entropy Coupling
arXiv - CS - Multiagent Systems. Pub Date: 2021-07-17, DOI: arxiv-2107.08295
Samuel Sokota, Christian Schroeder de Witt, Maximilian Igl, Luisa Zintgraf, Philip Torr, Shimon Whiteson, Jakob Foerster
In many common-payoff games, achieving good performance requires players to
develop protocols for communicating their private information implicitly --
i.e., using actions that have non-communicative effects on the environment.
Multi-agent reinforcement learning practitioners typically approach this
problem using independent learning methods in the hope that agents will learn
implicit communication as a byproduct of expected return maximization.
Unfortunately, independent learning methods are incapable of doing this in many
settings. In this work, we isolate the implicit communication problem by
identifying a class of partially observable common-payoff games, which we call
implicit referential games, whose difficulty can be attributed to implicit
communication. Next, we introduce a principled method based on minimum entropy
coupling that leverages the structure of implicit referential games, yielding a
new perspective on implicit communication. Lastly, we show that this method can
discover performant implicit communication protocols in settings with very
large spaces of messages.
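
For readers unfamiliar with the term: a coupling of two distributions is a joint distribution whose marginals are those two distributions, and a minimum entropy coupling is one whose joint entropy is as low as possible. The sketch below is only an illustration of that concept using a simple greedy heuristic (repeatedly pairing the largest remaining probability masses); it is not taken from the paper and may differ from the procedure the authors use. The function names `greedy_coupling` and `entropy` and the example distributions are hypothetical.

```python
import math


def greedy_coupling(p, q, tol=1e-12):
    """Greedy heuristic for a low-entropy coupling of marginals p and q.

    Illustrative assumption, not the paper's algorithm: at each step, match
    the largest remaining mass in p with the largest remaining mass in q.
    Returns a len(p) x len(q) joint table as nested lists.
    """
    p, q = list(p), list(q)
    joint = [[0.0] * len(q) for _ in p]
    while True:
        i = max(range(len(p)), key=p.__getitem__)  # index of largest mass left in p
        j = max(range(len(q)), key=q.__getitem__)  # index of largest mass left in q
        m = min(p[i], q[j])
        if m <= tol:  # nothing left to assign
            break
        joint[i][j] += m
        p[i] -= m
        q[j] -= m
    return joint


def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability entries."""
    return -sum(x * math.log2(x) for x in probs if x > 0)


if __name__ == "__main__":
    p = [0.5, 0.5]
    q = [0.5, 0.25, 0.25]
    joint = greedy_coupling(p, q)
    flat = [x for row in joint for x in row]
    print("joint entropy (bits):", entropy(flat))
    print("independent joint entropy (bits):", entropy(p) + entropy(q))
```

Intuitively, a low-entropy coupling makes one variable as predictable as possible from the other while leaving both marginals unchanged; the independent joint, by contrast, has entropy equal to the sum of the two marginal entropies.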
Updated: 2021-07-20