Efficient Decentralized Learning Dynamics for Extensive-Form Coarse Correlated Equilibrium: No Expensive Computation of Stationary Distributions Required
arXiv - CS - Computer Science and Game Theory. Pub Date: 2021-09-16. DOI: arxiv-2109.08138. Gabriele Farina, Andrea Celli, Tuomas Sandholm
While in two-player zero-sum games the Nash equilibrium is a well-established
prescriptive notion of optimal play, its applicability as a prescriptive tool
beyond that setting is limited. Consequently, the study of decentralized
learning dynamics that guarantee convergence to correlated solution concepts in
multiplayer, general-sum extensive-form (i.e., tree-form) games has become an
important topic of active research. The per-iteration complexity of the
currently known learning dynamics depends on the specific correlated solution
concept considered. For example, in the case of extensive-form correlated
equilibrium (EFCE), all known dynamics require, as an intermediate step at each
iteration, to compute the stationary distribution of multiple Markov chains, an
expensive operation in practice. Conversely, in the case of normal-form coarse
correlated equilibrium (NFCCE), simple no-external-regret learning dynamics
that amount to a linear-time traversal of the tree-form decision space of each
agent suffice to guarantee convergence. This paper focuses on extensive-form
coarse correlated equilibrium (EFCCE), an intermediate solution concept that is
a subset of NFCCE and a superset of EFCE. Because EFCCE is a superset of EFCE,
any learning dynamics that converge to EFCE automatically converge to EFCCE as well.
However, since EFCCE is a simpler solution concept, this raises the question: do
learning dynamics for EFCCE that avoid the expensive computation of stationary
distributions exist? This paper answers the previous question in the positive.
Our learning dynamics only require the orchestration of no-external-regret
minimizers, thus showing that EFCCE is more akin to NFCCE than to EFCE from a
learning perspective. Our dynamics guarantee that the empirical frequency of
play after $T$ iterations is an $O(1/\sqrt{T})$-approximate EFCCE with high
probability, and an EFCCE almost surely in the limit.
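The no-external-regret dynamics the abstract refers to can be illustrated on a much smaller scale. The sketch below is not the paper's algorithm; it is a minimal, self-contained example of regret matching (a standard no-external-regret minimizer) in a two-action normal-form game, showing how the empirical frequency of play emerges from independent regret minimizers. The game choice (matching pennies) and all function names are illustrative assumptions.

```python
import random

def regret_matching(regrets):
    # Standard regret-matching rule: play each action with probability
    # proportional to its positive cumulative regret; uniform if none positive.
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    n = len(regrets)
    return [p / total for p in pos] if total > 0 else [1.0 / n] * n

# Matching pennies: entry is the payoff to player 0; player 1 gets the negation.
PAYOFF = [[1.0, -1.0], [-1.0, 1.0]]

def run(T=20000, seed=0):
    """Run T rounds of decentralized regret matching; return each player's
    empirical action frequencies (the 'empirical frequency of play')."""
    rng = random.Random(seed)
    regrets = [[0.0, 0.0], [0.0, 0.0]]  # cumulative external regret per action
    counts = [[0, 0], [0, 0]]           # how often each action was played
    for _ in range(T):
        strats = [regret_matching(r) for r in regrets]
        actions = [0 if rng.random() < s[0] else 1 for s in strats]
        counts[0][actions[0]] += 1
        counts[1][actions[1]] += 1
        for p in (0, 1):
            opp = actions[1 - p]
            # Utility of action a for player p against the opponent's move.
            def u(a):
                return PAYOFF[a][opp] if p == 0 else -PAYOFF[opp][a]
            played = u(actions[p])
            for a in (0, 1):
                regrets[p][a] += u(a) - played  # external-regret update
    return [[c / T for c in row] for row in counts]

freqs = run()
```

Because each player's external regret grows sublinearly, the empirical frequency of play converges to the set of (normal-form) coarse correlated equilibria; in matching pennies this means each player's empirical frequencies approach the uniform distribution. The paper's contribution is showing that an analogous orchestration of no-external-regret minimizers, one per decision point of a tree-form decision space, suffices for EFCCE without computing stationary distributions.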
Updated: 2021-09-17