Forward Looking Best-Response Multiplicative Weights Update Methods
arXiv - CS - Computer Science and Game Theory. Pub Date: 2021-06-07, DOI: arxiv-2106.03579
Michail Fasoulakis, Evangelos Markakis, Yannis Pantazis, Constantinos Varsos

We propose a novel variant of the \emph{multiplicative weights update method} with forward-looking best-response strategies that guarantees last-iterate convergence for \emph{zero-sum games} with a unique \emph{Nash equilibrium}. In particular, we show that the proposed algorithm converges to an $\eta^{1/\rho}$-approximate Nash equilibrium, with $\rho > 1$, by decreasing the Kullback-Leibler divergence of each iterate at a rate of at least $\Omega(\eta^{1+\frac{1}{\rho}})$, for a sufficiently small learning rate $\eta$. Once our method enters a sufficiently small neighborhood of the solution, it becomes a contraction and converges to the Nash equilibrium of the game. Furthermore, we perform an experimental comparison with the recently proposed optimistic variant of the multiplicative weights update method \cite{Daskalakis2019LastIterateCZ}, which has also been proved to attain last-iterate convergence. Our findings reveal that our algorithm offers substantial gains over the previous approach, both in the convergence rate and in the size of the region of contraction.
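To make the setting concrete, below is a minimal sketch in Python/NumPy of a last-iterate MWU loop on a zero-sum matrix game, together with the Kullback-Leibler divergence used as the progress measure above. The forward-looking ingredient is modeled, as an assumption on our part, by having each player update against the opponent's best response to the current iterate; the paper's precise update rule is not given in this abstract and may differ. The function names (`forward_looking_mwu`, `kl`) and the matching-pennies payoff matrix are illustrative.

```python
import numpy as np

def forward_looking_mwu(A, eta=0.05, iters=5000):
    """Sketch of a last-iterate MWU loop for the zero-sum game min_x max_y x^T A y.

    The 'forward-looking' step is an illustrative assumption: each player
    applies the multiplicative weights update against the opponent's best
    response to the current iterate, not the iterate itself. The paper's
    exact rule may differ.
    """
    n, m = A.shape
    x = np.full(n, 1.0 / n)  # row player (minimizer), uniform start
    y = np.full(m, 1.0 / m)  # column player (maximizer), uniform start
    for _ in range(iters):
        # Anticipated pure best responses to the current strategies.
        br_y = np.eye(m)[np.argmax(x @ A)]  # column player's BR to x
        br_x = np.eye(n)[np.argmin(A @ y)]  # row player's BR to y
        # Multiplicative weights update against the anticipated opponent.
        x *= np.exp(-eta * (A @ br_y))
        x /= x.sum()
        y *= np.exp(eta * (br_x @ A))
        y /= y.sum()
    return x, y  # the last iterate itself, not a time average


def kl(p, q, eps=1e-12):
    """Kullback-Leibler divergence KL(p || q), the progress measure above."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))


if __name__ == "__main__":
    # Matching pennies: unique Nash equilibrium, uniform for both players.
    A = np.array([[1.0, -1.0], [-1.0, 1.0]])
    x, y = forward_looking_mwu(A)
    star = np.array([0.5, 0.5])
    print("x =", x, "KL to equilibrium:", kl(star, x))
    print("y =", y, "KL to equilibrium:", kl(star, y))
```

Note that the returned strategies are the final iterates, matching the last-iterate guarantee discussed in the abstract, whereas classical MWU analyses only certify convergence of the time-averaged strategies.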

Updated: 2021-06-08