PAC Reinforcement Learning Algorithm for General-Sum Markov Games
arXiv - CS - Computer Science and Game Theory. Pub Date: 2020-09-05, DOI: arxiv-2009.02605
Ashkan Zehfroosh and Herbert G. Tanner

This paper presents a theoretical framework for probably approximately correct (PAC) multi-agent reinforcement learning (MARL) algorithms for Markov games. The paper offers an extension to the well-known Nash Q-learning algorithm, using the idea of delayed Q-learning, in order to build a new PAC MARL algorithm for general-sum Markov games. In addition to guiding the design of a provably PAC MARL algorithm, the framework enables checking whether an arbitrary MARL algorithm is PAC. Comparative numerical results demonstrate performance and robustness.

Updated: 2020-09-09