Faster Algorithms for Optimal Ex-Ante Coordinated Collusive Strategies in Extensive-Form Zero-Sum Games


arXiv - CS - Multiagent Systems. Pub Date: 2020-09-21, DOI: arxiv-2009.10061
Gabriele Farina, Andrea Celli, Nicola Gatti, and Tuomas Sandholm

We focus on the problem of finding an optimal strategy for a team of two players that faces an opponent in an imperfect-information zero-sum extensive-form game. Team members are not allowed to communicate during play but can coordinate before the game. In that setting, it is known that the best the team can do is to sample a profile of potentially randomized strategies (one per player) from a joint (a.k.a. correlated) probability distribution at the beginning of the game. In this paper, we first provide new modeling results about computing such an optimal distribution by drawing a connection to a different literature on extensive-form correlation. Second, we provide an algorithm that computes such an optimal distribution using only profiles in which just one of the team members gets to randomize. We can also cap the number of such profiles allowed in the solution; increasing the cap yields an anytime algorithm. We find that a handful of well-chosen profiles of this kind often suffices to reach optimal utility for the team. This enables team members to coordinate through a relatively simple and understandable plan. Finally, inspired by this observation and leveraging theoretical concepts that we introduce, we develop an efficient column-generation algorithm for finding an optimal distribution for the team. We evaluate it on a suite of common benchmark games. It is three orders of magnitude faster than the prior state of the art on games that the latter can solve, and it can also solve several games that were previously unsolvable.
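To make the value of ex-ante coordination concrete, the sketch below compares a correlated distribution over team profiles with the best pair of independent mixed strategies. It is only an illustration under stated assumptions: the game is a hypothetical one-shot matching game invented for this example (not one of the paper's benchmarks, and normal-form rather than extensive-form), the names `team_payoff`, `value_against_best_response`, and `best_independent_value` are made up here, and the grid search is a stand-in for a real solver, not the paper's algorithm.

```python
from fractions import Fraction
from itertools import product

ACTIONS = (0, 1)


def team_payoff(a, b, c):
    """Hypothetical toy payoff: the team scores 1 when both members pick the
    same action and the opponent guesses the other action, else 0."""
    return 1 if a == b and a != c else 0


def value_against_best_response(joint):
    """Worst-case expected team payoff for a joint distribution over (a, b).

    `joint` maps pure team profiles (a, b) to probabilities. The opponent
    knows the distribution but not the sampled profile, and picks the
    action c that minimizes the team's expected payoff.
    """
    return min(
        sum(prob * team_payoff(a, b, c) for (a, b), prob in joint.items())
        for c in ACTIONS
    )


def best_independent_value(steps=100):
    """Grid search over independent mixed strategies (no correlation).

    p and q are the probabilities that each team member plays action 0.
    Exact rational arithmetic avoids floating-point noise in the result.
    """
    best = Fraction(0)
    for i, j in product(range(steps + 1), repeat=2):
        p, q = Fraction(i, steps), Fraction(j, steps)
        joint = {
            (a, b): (p if a == 0 else 1 - p) * (q if b == 0 else 1 - q)
            for a, b in product(ACTIONS, repeat=2)
        }
        best = max(best, value_against_best_response(joint))
    return best


# Coordinated plan agreed on before play: both pick 0, or both pick 1,
# decided by a shared fair coin flip.
correlated = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}
correlated_value = value_against_best_response(correlated)
independent_value = best_independent_value()

print(correlated_value, independent_value)  # prints "1/2 1/4"
```

In this toy game the coordinated plan needs only two pure profiles to guarantee 1/2, while independent randomization cannot guarantee more than 1/4, which mirrors the abstract's observation that a small number of well-chosen profiles can suffice.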


Updated: 2020-09-22
