Multi-Agent Reinforcement Learning in Cournot Games
arXiv - CS - Multiagent Systems Pub Date : 2020-09-14 , DOI: arxiv-2009.06224
Yuanyuan Shi, Baosen Zhang

In this work, we study the interaction of strategic agents in continuous-action Cournot games with limited information feedback. The Cournot game is the essential market model for many socio-economic systems in which agents learn and compete without full knowledge of the system or of each other. We consider the dynamics of the policy gradient algorithm, a widely adopted reinforcement learning algorithm for continuous control, in concave Cournot games. We prove that policy gradient dynamics converge to the Nash equilibrium when the price function is linear or the number of agents is two. To the best of our knowledge, this is the first convergence result for learning algorithms with continuous action spaces that do not fall in the no-regret class.
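
To make the linear-price case concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a price function P(Q) = a - b*Q, a common constant marginal cost c, and that each agent can evaluate the exact gradient of its own payoff, which is a simplification of the limited-feedback policy gradient setting described in the abstract. The names `cournot_gradient_play` and `nash_quantity`, and all parameter values, are hypothetical choices made for illustration.

```python
def cournot_gradient_play(a, b, c, q0, lr=0.1, steps=500):
    """Simultaneous projected gradient ascent in a linear Cournot game.

    Assumed model (not taken from the paper): agent i earns
    u_i(q) = q_i * (a - b * sum(q)) - c * q_i.
    """
    q = list(q0)
    for _ in range(steps):
        total = sum(q)
        # Partial derivative of u_i with respect to q_i.
        grads = [a - b * total - b * qi - c for qi in q]
        # All agents update at once; quantities are kept non-negative.
        q = [max(0.0, qi + lr * g) for qi, g in zip(q, grads)]
    return q


def nash_quantity(a, b, c, n):
    """Symmetric Nash quantity for the assumed linear model."""
    return (a - c) / (b * (n + 1))


q_final = cournot_gradient_play(a=10.0, b=1.0, c=1.0, q0=[0.5, 1.0, 3.0])
q_star = nash_quantity(10.0, 1.0, 1.0, n=3)
print(q_final, q_star)
```

With these assumed parameters, the step size satisfies lr * b * (n + 1) < 2, so the discrete update is a contraction and the quantities approach the symmetric Nash quantity from the given starting point.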

Updated: 2020-09-15