Forward Looking Best-Response Multiplicative Weights Update Methods
arXiv - CS - Computer Science and Game Theory. Pub Date: 2021-06-07, DOI: arxiv-2106.03579. Michail Fasoulakis, Evangelos Markakis, Yannis Pantazis, Constantinos Varsos
We propose a novel variant of the \emph{multiplicative weights update method}
with forward-looking best-response strategies that guarantees last-iterate
convergence for \emph{zero-sum games} with a unique \emph{Nash equilibrium}.
In particular, we show that the proposed algorithm converges to an
$\eta^{1/\rho}$-approximate Nash equilibrium, with $\rho > 1$, by decreasing
the Kullback-Leibler divergence of each iterate at a rate of at least
$\Omega(\eta^{1+\frac{1}{\rho}})$, for a sufficiently small learning rate $\eta$.
Once our method enters a sufficiently small neighborhood of the solution, it
becomes a contraction and converges to the Nash equilibrium of the game.
Furthermore, we perform an experimental comparison with the recently proposed
optimistic variant of the multiplicative weights update method by
\cite{Daskalakis2019LastIterateCZ}, which has also been proved to attain
last-iterate convergence. Our findings reveal that our algorithm offers
substantial gains in both the convergence rate and the region of
contraction relative to the previous approach.
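For context, the baseline the paper builds on can be sketched as follows. This is a minimal implementation of the *standard* multiplicative weights update on a two-player zero-sum matrix game, not the authors' forward-looking variant: plain MWU only guarantees that the *time-averaged* strategies approach the Nash equilibrium, whereas the proposed method strengthens this to last-iterate convergence. The game matrix, step count, and learning rate below are illustrative choices.

```python
import numpy as np

def mwu_zero_sum(A, x0, y0, eta=0.05, steps=10000):
    """Standard multiplicative weights update on a zero-sum game.

    The row player maximizes x^T A y, the column player minimizes it.
    Returns the time-averaged strategies, which are the quantities
    plain MWU is known to drive toward the Nash equilibrium.
    """
    x, y = x0.copy(), y0.copy()
    x_sum, y_sum = np.zeros_like(x), np.zeros_like(y)
    for _ in range(steps):
        gx = A @ y          # row player's payoff gradient
        gy = x @ A          # column player's payoff gradient
        x = x * np.exp(eta * gx)    # multiplicative (exponentiated) update
        x /= x.sum()
        y = y * np.exp(-eta * gy)   # column player moves to decrease payoff
        y /= y.sum()
        x_sum += x
        y_sum += y
    return x_sum / steps, y_sum / steps

# Matching pennies: a zero-sum game whose unique Nash equilibrium
# is (1/2, 1/2) for both players.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x_avg, y_avg = mwu_zero_sum(A, np.array([0.9, 0.1]), np.array([0.2, 0.8]))
```

On matching pennies the last iterates of plain MWU cycle around the equilibrium rather than converging to it, while the averages above settle near $(1/2, 1/2)$; removing this gap between average-iterate and last-iterate behavior is exactly what the forward-looking best-response modification targets.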
Updated: 2021-06-08