Multi-agent Bayesian Learning with Best Response Dynamics: Convergence and Stability


arXiv - CS - Computer Science and Game Theory. Pub Date: 2021-09-02, DOI: arxiv-2109.00719
Manxi Wu, Saurabh Amin, Asuman Ozdaglar

We study learning dynamics induced by strategic agents who repeatedly play a game with an unknown payoff-relevant parameter. In these dynamics, a belief estimate of the parameter is repeatedly updated using Bayes' rule, given the players' strategies and realized payoffs. Players adjust their strategies by accounting for best response strategies given the belief. We show that, with probability 1, beliefs and strategies converge to a fixed point, where the belief consistently estimates the payoff distribution for the strategy, and the strategy is an equilibrium corresponding to the belief. However, learning may not always identify the unknown parameter, because the belief estimate relies on game outcomes that are endogenously generated by the players' strategies. We obtain sufficient and necessary conditions under which learning leads to a globally stable fixed point that is a complete information Nash equilibrium. We also provide sufficient conditions that guarantee local stability of fixed point beliefs and strategies.
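The loop described in the abstract (a Bayesian belief update followed by a best-response strategy update) can be illustrated with a small toy simulation. The sketch below is not the paper's model: it assumes a hypothetical two-player Cournot game with linear inverse demand, two candidate values for an unknown demand intercept, Gaussian noise on the observed price, and simultaneous full best-response updates. All names and numbers (`simulate`, `intercepts`, `noise_std`, and so on) are illustrative assumptions. In this toy setting the parameter happens to be identifiable from any outcome, so the belief concentrates on the true value and the quantities approach the complete-information Cournot equilibrium; it does not reproduce the non-identification case the abstract mentions.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def simulate(steps=200, seed=0):
    """Toy belief/strategy dynamics in a two-player Cournot game (illustrative only)."""
    rng = random.Random(seed)

    # Hypothetical setup: the unknown parameter is the demand intercept.
    intercepts = [8.0, 12.0]   # candidate parameter values
    true_index = 1             # the true intercept is 12.0
    slope = 1.0                # known demand slope
    noise_std = 1.0            # known noise level on the observed price

    belief = [0.5, 0.5]        # uniform prior over the candidates
    q = [1.0, 1.0]             # initial quantities (strategies)

    for _ in range(steps):
        total = q[0] + q[1]

        # Realized outcome: noisy price generated by the current strategies.
        price = intercepts[true_index] - slope * total + rng.gauss(0.0, noise_std)

        # Bayes' rule: reweight each candidate by the likelihood of the price.
        weights = [
            b * gaussian_pdf(price, a - slope * total, noise_std)
            for b, a in zip(belief, intercepts)
        ]
        norm = sum(weights)
        belief = [w / norm for w in weights]

        # Best response to the opponent's current quantity under the
        # belief-weighted expected intercept (zero production cost assumed).
        expected_a = sum(b * a for b, a in zip(belief, intercepts))
        q = [
            max(0.0, (expected_a - slope * q[1]) / (2.0 * slope)),
            max(0.0, (expected_a - slope * q[0]) / (2.0 * slope)),
        ]

    return belief, q


if __name__ == "__main__":
    final_belief, final_q = simulate()
    print("belief:", final_belief)
    print("quantities:", final_q)
```

Under these assumptions the best-response map is a contraction, so once the belief settles the quantities converge to the intercept divided by three times the slope for each player. Observing the price rather than each player's payoff is a simplification; the two carry the same information whenever a player's quantity is positive.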


Updated: 2021-09-03
