Model-based Multi-Agent Reinforcement Learning with Cooperative Prioritized Sweeping, arXiv - CS - Artificial Intelligence

Model-based Multi-Agent Reinforcement Learning with Cooperative Prioritized Sweeping
arXiv - CS - Artificial Intelligence. Pub Date: 2020-01-15, DOI: arxiv-2001.07527
Eugenio Bargiacchi, Timothy Verstraeten, Diederik M. Roijers, Ann Nowé

We present a new model-based reinforcement learning algorithm, Cooperative Prioritized Sweeping, for efficient learning in multi-agent Markov decision processes. The algorithm allows for sample-efficient learning on large problems by exploiting a factorization to approximate the value function. Our approach only requires knowledge about the structure of the problem in the form of a dynamic decision network. Using this information, our method learns a model of the environment and performs temporal difference updates that affect multiple joint states and actions at once. In addition, batch updates are performed that efficiently back-propagate knowledge throughout the factored Q-function. Our method outperforms the state-of-the-art sparse cooperative Q-learning algorithm, both on the well-known SysAdmin benchmark and on randomized environments.
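To make the idea of a factored Q-function concrete, the following is a minimal illustrative sketch, not the authors' algorithm. It assumes the joint Q-value is approximated as a sum of small local tables, each covering a subset of state factors and agents, and that the temporal difference error is split equally among the tables. The class name `FactoredQ`, the `scopes` layout, and the equal split are assumptions made for illustration only. The model learning and the prioritized batch updates described in the abstract are omitted.

```python
from collections import defaultdict


class FactoredQ:
    """Q(s, a) approximated as a sum of local tables (hypothetical layout).

    Each scope is a pair (state_indices, agent_indices) naming the state
    factors and agents that one local table depends on.
    """

    def __init__(self, scopes):
        self.scopes = scopes
        self.tables = [defaultdict(float) for _ in scopes]

    def _key(self, scope, state, action):
        # Project the joint state and joint action onto this table's scope.
        state_idx, agent_idx = scope
        return (tuple(state[i] for i in state_idx),
                tuple(action[i] for i in agent_idx))

    def value(self, state, action):
        # The joint value is the sum of the local entries.
        return sum(table[self._key(scope, state, action)]
                   for scope, table in zip(self.scopes, self.tables))

    def td_update(self, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
        # Assumption: the TD error is shared equally among local tables.
        delta = r + gamma * self.value(s_next, a_next) - self.value(s, a)
        share = alpha * delta / len(self.tables)
        for scope, table in zip(self.scopes, self.tables):
            table[self._key(scope, s, a)] += share
        return delta


# Two agents, two binary state factors; each table sees one factor and one agent.
q = FactoredQ(scopes=[((0,), (0,)), ((1,), (1,))])
delta = q.td_update(s=(0, 0), a=(1, 1), r=1.0, s_next=(0, 0), a_next=(0, 0))

visited = q.value((0, 0), (1, 1))    # the pair that was actually experienced
unvisited = q.value((0, 1), (1, 0))  # never experienced, shares one local entry
```

After the single update, `unvisited` is also nonzero, because the joint pair it belongs to shares a local table entry with the experienced one. This is the sense in which one update in a factored representation can affect multiple joint states and actions at once.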

Updated: 2020-01-22
