PAC Reinforcement Learning Algorithm for General-Sum Markov Games, arXiv - CS - Computer Science and Game Theory

Pub Date: 2020-09-05, DOI: arxiv-2009.02605
Ashkan Zehfroosh and Herbert G. Tanner

This paper presents a theoretical framework for probably approximately correct (PAC) multi-agent reinforcement learning (MARL) algorithms for Markov games. The paper offers an extension to the well-known Nash Q-learning algorithm, using the idea of delayed Q-learning, in order to build a new PAC MARL algorithm for general-sum Markov games. In addition to guiding the design of a provably PAC MARL algorithm, the framework enables checking whether an arbitrary MARL algorithm is PAC. Comparative numerical results demonstrate performance and robustness.
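
To make the "delayed Q-learning" idea mentioned in the abstract concrete, here is a minimal single-agent sketch in Python. It is not the paper's algorithm: the class name `DelayedQ`, its parameters (`gamma`, `m`, `eps1`), and the omission of the usual LEARN-flag bookkeeping are assumptions made for illustration. The core idea shown is that a Q-value is only overwritten after `m` samples have been collected, and only if the averaged target is lower than the current estimate by at least `2 * eps1`. In a Nash Q-learning style extension, the `max` in `value` would presumably be replaced by a Nash equilibrium value of the stage game.

```python
from collections import defaultdict


class DelayedQ:
    """Simplified single-agent delayed Q-learning update (illustrative only)."""

    def __init__(self, actions, gamma=0.9, m=5, eps1=0.05):
        self.actions = list(actions)
        self.gamma, self.m, self.eps1 = gamma, m, eps1
        # Optimistic initial value, assuming rewards lie in [0, 1].
        v_max = 1.0 / (1.0 - gamma)
        self.q = defaultdict(lambda: v_max)
        self.acc = defaultdict(float)  # summed targets per (state, action)
        self.count = defaultdict(int)  # samples since the last update attempt

    def value(self, state):
        # Greedy state value; a multi-agent variant would use a
        # stage-game equilibrium value here instead (assumption).
        return max(self.q[(state, a)] for a in self.actions)

    def observe(self, state, action, reward, next_state):
        """Record one transition; return True if Q(state, action) changed."""
        key = (state, action)
        self.acc[key] += reward + self.gamma * self.value(next_state)
        self.count[key] += 1
        if self.count[key] < self.m:
            return False  # not enough samples yet: the update is delayed
        target = self.acc[key] / self.m
        self.acc[key], self.count[key] = 0.0, 0
        # Update only when the estimate would drop by a significant margin.
        if self.q[key] - target >= 2 * self.eps1:
            self.q[key] = target + self.eps1
            return True
        return False
```

Batching `m` samples before each attempted update, and rejecting small changes, is what bounds the number of successful updates and makes a PAC-style analysis possible in the single-agent setting.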

Updated: 2020-09-09