On the Complexity of Computing Markov Perfect Equilibrium in General-Sum Stochastic Games
arXiv - CS - Computational Complexity. Pub Date: 2021-09-04. arXiv:2109.01795
Xiaotie Deng, Yuhao Li, David Henry Mguni, Jun Wang, Yaodong Yang

Similar to the role of Markov decision processes in reinforcement learning, Stochastic Games (SGs) lay the foundation for the study of multi-agent reinforcement learning (MARL) and sequential agent interactions. In this paper, we show that computing an approximate Markov Perfect Equilibrium (MPE) in a finite-state discounted Stochastic Game, to exponential precision, is PPAD-complete. We adopt a function over the strategy space with a polynomially bounded description to convert MPE computation into a fixed-point problem, even though each agent may have a number of pure strategies that is exponential in the number of states. The completeness result follows from the reduction of the fixed-point problem to End of the Line. Our results indicate that finding an MPE in SGs is unlikely to be NP-hard, as NP-hardness would imply NP = co-NP. Our work gives MARL research confidence to study MPE computation on general-sum SGs and to develop algorithms as fruitful as those currently available for zero-sum SGs.
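To make the fixed-point formulation concrete, the following is an illustrative sketch of the classical Fink-style Brouwer map over stationary Markov strategies, stated in standard discounted-SG notation (rewards $r_i$, transition kernel $P$, discount $\gamma \in (0,1)$); this is an assumption for illustration, not necessarily the exact function the paper constructs. Given a stationary profile $\pi = (\pi_1, \dots, \pi_n)$ with values $V_i^\pi$, let

\[
Q_i^\pi(s, a_i) = \mathbb{E}_{a_{-i} \sim \pi_{-i}(s)} \Big[ r_i(s, a_i, a_{-i}) + \gamma \sum_{s'} P(s' \mid s, a_i, a_{-i})\, V_i^\pi(s') \Big],
\qquad
g_{i,s,a_i}(\pi) = \max\{0,\ Q_i^\pi(s, a_i) - V_i^\pi(s)\},
\]

and define the update

\[
f(\pi)_i(s)(a_i) = \frac{\pi_i(s)(a_i) + g_{i,s,a_i}(\pi)}{1 + \sum_{b \in A_i} g_{i,s,b}(\pi)}.
\]

The map $f$ sends the product of simplices $\prod_{i,s} \Delta(A_i)$ continuously to itself, and its fixed points are exactly the stationary MPEs. Crucially, this domain has description size polynomial in the numbers of states and actions, which is how a fixed-point formulation can sidestep the exponentially large set of pure Markov strategies mentioned in the abstract.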

Updated: 2021-09-07