Should I remember more than you? Best responses to factored strategies
International Journal of Game Theory ( IF 0.6 ) Pub Date : 2020-10-21 , DOI: 10.1007/s00182-020-00733-1
René Levínský , Abraham Neyman , Miroslav Zelený

In this paper we offer a new, unifying approach to modeling strategies of bounded complexity. In our model, the strategy of a player in a game does not directly map the set H of histories to the set of her actions. Instead, the player’s perception of H is represented by a map $$\varphi :H \rightarrow X,$$ where X reflects the “cognitive complexity” of the player, and the strategy chooses its mixed action at history h as a function of $$\varphi (h)$$ . In this case we say that $$\varphi $$ is a factor of a strategy and that the strategy is $$\varphi $$ -factored. Stationary strategies, strategies played by finite automata, and strategies with bounded recall are the most prominent examples of factored strategies in multistage games. A factor $$\varphi $$ is recursive if its value at history $$h'$$ that follows history h is a function of $$\varphi (h)$$ and the incremental information $$h'\setminus h$$ . For example, in a repeated game with perfect monitoring, a factor $$\varphi $$ is recursive if its value $$\varphi (a_1,\ldots ,a_t)$$ on a finite string of action profiles $$(a_1,\ldots ,a_t)$$ is a function of $$\varphi (a_1,\ldots ,a_{t-1})$$ and $$a_t$$ .

We prove that in a discounted infinitely repeated game and (more generally) in a stochastic game with finitely many actions and perfect monitoring, if the factor $$\varphi $$ is recursive, then for every profile of $$\varphi $$ -factored strategies there is a pure $$\varphi $$ -factored strategy that is a best reply, and if the stochastic game has finitely many states and actions and the factor $$\varphi $$ has a finite range then there is a pure $$\varphi $$ -factored strategy that is a best reply in all the discounted games with a sufficiently large discount factor.
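To make the definitions concrete, here is a minimal sketch (not from the paper) of a recursive factor in a repeated game with perfect monitoring. The perception set X, the 1-recall factor, and the tit-for-tat strategy below are illustrative assumptions: X consists of the possible last action profiles plus a mark for the empty history, so the factor has finite range, and recursiveness means the new perception depends only on the old perception and the incremental action profile.

```python
EMPTY = None  # perception of the empty history

def update(x, a):
    """Recursive update: the new perception depends only on the
    previous perception x and the incremental information a
    (the latest action profile). For 1-recall, the new perception
    is simply the last profile."""
    return a

def phi(history):
    """phi computed on a full history via the recursion, so that
    phi(a_1, ..., a_t) is a function of phi(a_1, ..., a_{t-1}) and a_t."""
    x = EMPTY
    for a in history:
        x = update(x, a)
    return x

def factored_strategy(x):
    """A pure phi-factored strategy: the chosen action is a function of
    the perception x alone, never of the full history. Illustration:
    tit-for-tat for player 1 in a 2-player repeated game with
    hypothetical action labels "C" and "D"."""
    if x is EMPTY:
        return "C"        # play "C" at the empty history
    return x[1]           # copy the opponent's last action

# Example: a history of three action profiles (player 1, player 2).
history = [("C", "C"), ("C", "D"), ("D", "C")]
x = phi(history)
print(x)                      # ('D', 'C')
print(factored_strategy(x))   # 'C'

# Recursiveness check: phi(h') is determined by phi(h) and h' \ h.
assert phi(history) == update(phi(history[:-1]), history[-1])
```

The point of the paper's result, in these terms, is that against any profile of φ-factored strategies a player loses nothing by restricting attention to pure strategies of exactly this form: a map from X to actions composed with the recursive factor.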

Last updated: 2020-10-21