Last Round Convergence and No-Instant Regret in Repeated Games with Asymmetric Information
arXiv - CS - Computer Science and Game Theory. Pub Date: 2020-03-26, DOI: arxiv-2003.11727
Le Cong Dinh, Long Tran-Thanh, Tri-Dung Nguyen, Alain B. Zemkoho

This paper considers repeated games in which one player has more information about the game than the other players. In particular, we investigate repeated two-player zero-sum games where only the column player knows the payoff matrix A of the game. Suppose that while repeatedly playing this game, the row player chooses her strategy at each round by using a no-regret algorithm to minimize her (pseudo) regret. We develop a no-instant-regret algorithm for the column player to exhibit last round convergence to a minimax equilibrium. We show that our algorithm is efficient against a large set of popular no-regret algorithms of the row player, including the multiplicative weight update algorithm, the online mirror descent method/follow-the-regularized-leader, the linear multiplicative weight update algorithm, and the optimistic multiplicative weight update.
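As a rough illustration of the row player's side of this setting, the sketch below runs a multiplicative weight update against a fixed column strategy and reports the row player's average regret. It is not the paper's column-player algorithm, which the abstract does not specify. The names `mwu_step` and `play_against_fixed_column`, the step size `eta`, the example matrix, and the convention that the row player minimizes the loss x^T A y are all assumptions made for this example.

```python
import numpy as np


def mwu_step(x, loss, eta):
    """One multiplicative weight update: down-weight rows with high loss."""
    w = x * np.exp(-eta * loss)
    return w / w.sum()


def play_against_fixed_column(A, y, rounds=500, eta=0.1):
    """Row player runs MWU while the column player repeats strategy y.

    Assumption: the row player minimizes the loss x^T A y and observes
    the loss vector A @ y after each round.
    Returns the final row strategy and the average regret per round.
    """
    n_rows = A.shape[0]
    x = np.full(n_rows, 1.0 / n_rows)   # start from the uniform strategy
    total_loss = 0.0                    # loss actually incurred by the row player
    cumulative = np.zeros(n_rows)       # loss each pure row would have incurred
    for _ in range(rounds):
        loss = A @ y
        total_loss += float(x @ loss)
        cumulative += loss
        x = mwu_step(x, loss, eta)
    regret = total_loss - cumulative.min()
    return x, regret / rounds


# Hypothetical 2x2 loss matrix and a fixed (non-equilibrium) column strategy.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
y = np.array([0.7, 0.3])
x_final, avg_regret = play_against_fixed_column(A, y)
```

Against a column strategy that never changes, the row strategy concentrates on the best-response row and the average regret shrinks as the number of rounds grows. The paper's question is the reverse one: how the informed column player should move so that play settles at a minimax equilibrium in the last round, not only on average.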

Updated: 2020-03-27