Solving Structured Hierarchical Games Using Differential Backward Induction
arXiv - CS - Computer Science and Game Theory. Pub Date: 2021-06-08, DOI: arxiv-2106.04663
Zun Li, Feiran Jia, Aditya Mate, Shahin Jabbari, Mithun Chakraborty, Milind Tambe, Yevgeniy Vorobeychik

Many real-world systems possess a hierarchical structure in which a strategic plan is forwarded and implemented in a top-down manner. Examples include business activities in large companies and policy making for reducing disease spread during pandemics. We introduce a novel class of games, which we call structured hierarchical games (SHGs), to capture these strategic interactions. In an SHG, each player is represented as a vertex in a multi-layer decision tree and controls a real-valued action vector, reacting to orders from its predecessors and strategically influencing its descendants' behavior based on its own subjective utility. SHGs generalize extensive-form games as well as Stackelberg games. For general SHGs with (possibly) nonconvex payoffs and high-dimensional action spaces, we propose a new solution concept, which we call local subgame perfect equilibrium. By exploiting the hierarchical structure and strategic dependencies in payoffs, we derive a backpropagation-style gradient-based algorithm, which we call Differential Backward Induction (DBI), to compute an equilibrium. We theoretically characterize the convergence properties of DBI and empirically demonstrate a large overlap between the stable points reached by DBI and equilibrium solutions. Finally, we demonstrate the effectiveness of our algorithm in finding globally stable solutions and its scalability for a recently introduced class of SHGs for pandemic policy making.
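
To make the idea of a backpropagation-style gradient in a hierarchy more concrete, the following is a minimal sketch on a toy two-level (leader and follower) game with scalar actions. It is not the paper's DBI algorithm. The payoffs, the function name `solve_two_level`, and the parameters `a`, `lr`, and `steps` are all assumptions chosen for illustration. The follower ascends its own partial gradient, while the leader ascends a total derivative that accounts for how the follower's best response shifts with the leader's action (obtained here from the implicit function theorem).

```python
def solve_two_level(a=0.5, lr=0.1, steps=500):
    """Toy leader-follower gradient dynamics (illustrative assumption only).

    Hypothetical payoffs:
      follower: u_F(x, y) = -(y - a*x)**2
      leader:   u_L(x, y) = -(x - 1)**2 - (y - 1)**2
    """
    x, y = 0.0, 0.0
    for _ in range(steps):
        # Follower: partial gradient of its own payoff w.r.t. its own action.
        grad_y = -2.0 * (y - a * x)

        # Sensitivity of the follower's best response to x, from the
        # implicit function theorem: dy*/dx = -(d2u_F/dy2)^-1 * d2u_F/dydx.
        d2_yy = -2.0
        d2_yx = 2.0 * a
        dy_dx = -d2_yx / d2_yy  # equals a for this payoff

        # Leader: total derivative = direct term + term propagated
        # back through the follower's response.
        grad_x = -2.0 * (x - 1.0) + (-2.0 * (y - 1.0)) * dy_dx

        # Simultaneous gradient ascent step for both players.
        x, y = x + lr * grad_x, y + lr * grad_y
    return x, y


if __name__ == "__main__":
    x, y = solve_two_level()
    print(f"leader x = {x:.4f}, follower y = {y:.4f}")
```

For these assumed payoffs the follower's best response is y = a*x, so the leader's optimum can be checked by hand as x = (1 + a) / (1 + a^2); with a = 0.5 that gives x = 1.2 and y = 0.6, which the iteration should approach.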

Updated: 2021-06-10