Open Ad Hoc Teamwork using Graph-based Policy Learning
arXiv - CS - Multiagent Systems. Pub Date: 2020-06-18. DOI: arxiv-2006.10412. Authors: Arrasy Rahman, Niklas Hopner, Filippos Christianos, Stefano V. Albrecht
Ad hoc teamwork is the challenging problem of designing an autonomous agent
that can adapt quickly to collaborate with previously unknown teammates. Prior
work in this area has focused on closed teams in which the number of agents is
fixed. In this work, we consider open teams by allowing agents of varying types
to enter and leave the team without prior notification. Our solution builds on
graph neural networks to learn agent models and joint action-value
decompositions under varying team sizes, which can be trained with
reinforcement learning using a discounted returns objective. We demonstrate
empirically that our approach effectively models the impact of other agents'
actions on the controlled agent's returns, producing policies that adapt
robustly to dynamic team composition and generalize effectively to larger
teams than were seen during training.
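The abstract does not state the exact form of the joint action-value decomposition, so the following is only a toy sketch of one plausible reading. It assumes the joint value is a sum of per-agent and pairwise utilities, and that the agent model gives independent action probabilities for each teammate. All names (`learner_action_values`, `individual_q`, `pairwise_q`, `teammate_probs`) are hypothetical, and the utilities are hand-written numbers rather than graph neural network outputs. The point it illustrates is that the same computation applies to any team size.

```python
from itertools import product


def learner_action_values(individual_q, pairwise_q, teammate_probs):
    """Expected value of each learner action under predicted teammate actions.

    Assumed (hypothetical) inputs, with the learner as agent 0:
      individual_q[j][a_j]         utility of agent j taking action a_j
      pairwise_q[(j, k)][a_j][a_k] utility of the pair j < k
      teammate_probs[j - 1][a_j]   predicted probability that teammate j takes a_j
    """
    n_agents = len(individual_q)
    n_actions = len(individual_q[0])
    values = []
    for a0 in range(n_actions):
        expected = 0.0
        # Enumerate every joint teammate action. This is exponential in team
        # size, which is acceptable only because this is a toy example.
        for others in product(range(n_actions), repeat=n_agents - 1):
            joint = (a0,) + others
            # Assumption: teammates act independently given the state.
            prob = 1.0
            for j, a_j in enumerate(others):
                prob *= teammate_probs[j][a_j]
            # Assumed decomposition: individual terms plus pairwise terms.
            q = sum(individual_q[j][joint[j]] for j in range(n_agents))
            q += sum(
                pairwise_q[(j, k)][joint[j]][joint[k]]
                for j in range(n_agents)
                for k in range(j + 1, n_agents)
            )
            expected += prob * q
        values.append(expected)
    return values


# Two-agent team: the learner plus one teammate who acts uniformly at random.
two_agent_values = learner_action_values(
    individual_q=[[1.0, 0.0], [0.0, 0.0]],
    pairwise_q={(0, 1): [[0.0, 2.0], [6.0, 0.0]]},
    teammate_probs=[[0.5, 0.5]],
)

# Three-agent team: a second teammate has joined; the function is unchanged.
zeros = [[0.0, 0.0], [0.0, 0.0]]
three_agent_values = learner_action_values(
    individual_q=[[1.0, 0.0], [0.0, 0.0], [0.0, 0.0]],
    pairwise_q={(0, 1): zeros, (0, 2): zeros, (1, 2): zeros},
    teammate_probs=[[0.5, 0.5], [1.0, 0.0]],
)
```

The learner would then act greedily (or epsilon-greedily during training) with respect to the returned values. Because the sums run over whatever agents are currently present, a teammate entering or leaving changes only the size of the inputs.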
Updated: 2020-10-19