Local Information Opponent Modelling Using Variational Autoencoders, arXiv - CS - Multiagent Systems


arXiv - CS - Multiagent Systems Pub Date : 2020-06-16 , DOI: arxiv-2006.09447
Georgios Papoudakis, Filippos Christianos, Stefano V. Albrecht

Modelling the behaviours of other agents (opponents) is essential for understanding how agents interact and for making effective decisions. Existing methods for opponent modelling commonly assume knowledge of the local observations and chosen actions of the modelled opponents, which can significantly limit their applicability. We propose a new modelling technique based on variational autoencoders, which are trained to reconstruct the local actions and observations of the opponent based on embeddings that depend only on the local observations of the modelling agent (its observed world state, chosen actions, and received rewards). The embeddings are used to augment the modelling agent's decision policy, which is trained via deep reinforcement learning; thus the policy does not require access to opponent observations. We provide a comprehensive evaluation and ablation study in diverse multi-agent tasks, showing that our method achieves comparable performance to an ideal baseline which has full access to the opponent's information, and significantly higher returns than a baseline method which does not use the learned embeddings.
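As a rough illustration of the training signal described in the abstract, the sketch below computes a per-step loss that a variational autoencoder of this kind might minimise: a reconstruction term for the opponent's action and observation, plus a KL regulariser on the embedding. The abstract does not specify the loss form, the prior, or the network architecture, so everything here is an assumption made for illustration: the standard-normal prior, the categorical opponent action, the squared-error observation term, the weight `beta`, and all function names are hypothetical. The encoder that would produce `mu` and `log_var` from the modelling agent's own observations, actions, and rewards is not shown.

```python
import math


def gaussian_kl(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)) in closed form.

    Assumption: a standard-normal prior; the abstract does not state the prior.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def softmax_cross_entropy(logits, target_index):
    """Negative log-probability of a discrete opponent action under the decoder."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target_index]


def squared_error(pred, target):
    """Reconstruction error for the opponent's (continuous) observation."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def opponent_vae_loss(mu, log_var, action_logits, opp_action,
                      obs_pred, opp_obs, beta=1.0):
    """Hypothetical per-step loss: reconstruction plus beta-weighted KL.

    mu, log_var   : embedding parameters, assumed to come from an encoder that
                    sees only the modelling agent's local information.
    action_logits : decoder output over the opponent's discrete actions.
    obs_pred      : decoder output for the opponent's observation.
    opp_action, opp_obs : opponent targets, needed only while training the
                    autoencoder in this sketch.
    """
    recon = softmax_cross_entropy(action_logits, opp_action)
    recon += squared_error(obs_pred, opp_obs)
    return recon + beta * gaussian_kl(mu, log_var)
```

In a setup like this, the opponent's actions and observations appear only as reconstruction targets. At decision time, an embedding drawn from the encoder would be appended to the policy input, which is consistent with the abstract's statement that the policy does not require access to opponent observations.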


Updated: 2020-10-07
