A Proximal/Gradient Approach for Computing the Nash Equilibrium in Controllable Markov Games
Journal of Optimization Theory and Applications (IF 1.9), Pub Date: 2021-01-20, DOI: 10.1007/s10957-021-01812-3
Julio B. Clempner

This paper proposes a new algorithm for computing the Nash equilibrium of games defined on homogeneous, finite, ergodic and controllable Markov chains. The algorithm iterates both the proximal and the gradient method. We formulate the problem as a poly-linear programming problem and then regularize the poly-linear functional, applying the regularization to the Lagrange functional, to ensure that the method converges to one of the Nash equilibria of the game. The paper makes two main contributions. The first, theoretical, contribution is the iterative approach itself, which combines the proximal and the gradient method to compute the Nash equilibria in Markov games. The method transforms the game-theoretic problem into a system of equations in which each equation is an independent optimization problem, and the necessary condition for a minimum of each is computed with a nonlinear programming solver. The iterative approach converges quickly to the Nash equilibrium point. The second, computational, contribution is an analysis of the convergence of the proposed method, including the rate of convergence of the step-size parameter. These results are of interest in computational and algorithmic game theory. A numerical example illustrates the proposed approach.
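To give a feel for the kind of iteration the abstract describes, here is a minimal sketch under strong simplifying assumptions. It is not the paper's algorithm. It uses a static two-player zero-sum matrix game (matching pennies) instead of a Markov game, and it swaps in the projected extragradient method, a standard two-step gradient scheme often used as a cheap approximation to a proximal step. The function names, the starting point, the step size and the iteration count are all illustrative choices and do not come from the paper.

```python
def clip01(value):
    """Project a scalar onto the interval [0, 1]."""
    return min(1.0, max(0.0, value))


def extragradient_matching_pennies(p=0.9, q=0.2, step=0.1, iterations=500):
    """Approximate the mixed Nash equilibrium of matching pennies.

    p, q: probabilities that players 1 and 2 play their first action.
    Player 1's expected payoff is (2p - 1)(2q - 1); player 2 receives
    the negative of that value. All defaults are illustrative assumptions.
    """
    for _ in range(iterations):
        # Prediction step: each player takes a projected gradient-ascent
        # step on its own payoff, holding the opponent's strategy fixed.
        p_half = clip01(p + step * 2.0 * (2.0 * q - 1.0))
        q_half = clip01(q - step * 2.0 * (2.0 * p - 1.0))
        # Correction step: start again from (p, q), but use the gradients
        # evaluated at the predicted point (p_half, q_half).
        p, q = (
            clip01(p + step * 2.0 * (2.0 * q_half - 1.0)),
            clip01(q - step * 2.0 * (2.0 * p_half - 1.0)),
        )
    return p, q
```

Calling `extragradient_matching_pennies()` returns a pair of probabilities that both approach 0.5, the unique mixed equilibrium of this toy game. A plain simultaneous gradient step tends to cycle on bilinear games of this kind, which is one reason proximal-type corrections are used.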




Updated: 2021-01-20