Stochastic Control Approach to Reputation Games
arXiv - CS - Computer Science and Game Theory. Pub Date: 2016-04-01. DOI: arxiv-1604.00299. Authors: Nuh Aygün Dalkıran and Serdar Yüksel
Through a stochastic control theoretic approach, we analyze reputation games
where a strategic long-lived player acts in a sequential repeated game against
a collection of short-lived players. The key assumption in our model is that
the information of the short-lived players is nested in that of the long-lived
player. This nested information structure is obtained through an appropriate
monitoring structure. Under this monitoring structure, we show that, given mild
assumptions, the set of Perfect Bayesian Equilibrium payoffs coincides with the set of
Markov Perfect Equilibrium payoffs, and hence a dynamic programming formulation
can be obtained for the computation of equilibrium strategies of the strategic
long-lived player in the discounted setup. We also consider the undiscounted
average-payoff setup where we obtain an optimal equilibrium strategy of the
strategic long-lived player under further technical conditions. We then use
this optimal strategy in the undiscounted setup as a tool to obtain a tight
upper payoff bound for the arbitrarily patient long-lived player in the
discounted setup. Finally, by using measure concentration techniques, we obtain
a refined lower payoff bound on the value of reputation in the discounted
setup. We also study the continuity of equilibrium payoffs in the prior
beliefs.
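
To make the dynamic programming idea in the abstract concrete, the following is a minimal toy sketch and not the authors' model. It assumes a long-lived player who is either a commitment type (always playing "H") or a strategic type, a binary public signal, and short-lived players whose belief mu in the commitment type serves as the state. The signal probabilities, the stage payoff, the discount factor, and all function names are hypothetical. For simplicity the short-lived players' conjecture about the strategic type is held fixed at "L", so the code computes a best response by value iteration on a belief grid rather than an equilibrium; an equilibrium would additionally require the conjecture to match the resulting policy.

```python
# Toy illustration only: every number and name here is an assumption,
# not taken from the paper.

P_SIGNAL = {"H": 0.8, "L": 0.3}        # assumed P(signal y = 1 | action)
DELTA = 0.9                            # assumed discount factor
GRID = [i / 100 for i in range(101)]   # grid over the belief mu in [0, 1]


def bayes_update(mu, y, conjecture="L"):
    """Posterior belief in the commitment type after public signal y.

    The commitment type always plays "H"; short-lived players conjecture
    that the strategic type plays `conjecture` (held fixed in this sketch).
    """
    like_commit = P_SIGNAL["H"] if y == 1 else 1 - P_SIGNAL["H"]
    like_strat = P_SIGNAL[conjecture] if y == 1 else 1 - P_SIGNAL[conjecture]
    denom = mu * like_commit + (1 - mu) * like_strat
    return mu * like_commit / denom if denom > 0 else mu


def stage_payoff(mu, action):
    """Hypothetical stage payoff: reputation pays 2 * mu, playing "H" costs 0.5."""
    return 2.0 * mu - (0.5 if action == "H" else 0.0)


def interp(values, mu):
    """Linearly interpolate the value function at belief mu on GRID."""
    x = min(max(mu, 0.0), 1.0) * (len(GRID) - 1)
    i = min(int(x), len(GRID) - 2)
    w = x - i
    return (1 - w) * values[i] + w * values[i + 1]


def q_value(values, mu, action):
    """Normalized discounted payoff of `action` at belief mu, then following `values`."""
    p1 = P_SIGNAL[action]
    cont = (p1 * interp(values, bayes_update(mu, 1))
            + (1 - p1) * interp(values, bayes_update(mu, 0)))
    return (1 - DELTA) * stage_payoff(mu, action) + DELTA * cont


def value_iteration(tol=1e-8, max_iter=10000):
    """Iterate the Bellman operator on the belief grid until the sup-norm change is below tol."""
    values = [0.0] * len(GRID)
    for _ in range(max_iter):
        new_values = [max(q_value(values, mu, a) for a in ("H", "L")) for mu in GRID]
        if max(abs(x - y) for x, y in zip(values, new_values)) < tol:
            return new_values
        values = new_values
    return values
```

Calling `value_iteration()` returns the approximate value of the strategic type at each grid belief. Payoffs are scaled by (1 - DELTA) so that values stay on the same scale as stage payoffs, a common normalization when comparing discounted and average-payoff criteria.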
Updated: 2020-01-22