Performance Analysis of Trial and Error Algorithms, IEEE Transactions on Parallel and Distributed Systems


Performance Analysis of Trial and Error Algorithms
IEEE Transactions on Parallel and Distributed Systems (IF 5.3), Pub Date: 2020-06-01, DOI: 10.1109/tpds.2020.2964256
Jerome Gaveau, Christophe J. Le Martret, Mohamad Assaad

Model-free decentralized optimization and learning are receiving increasing attention from both theoretical and practical perspectives. In particular, two fully decentralized learning algorithms, Trial and Error Learning (TEL) and Optimal Dynamical Learning (ODL), are appealing for a broad class of games. ODL spends a high proportion of time in an optimum state that maximizes the sum of the utilities of all players. TEL does the same when a pure Nash equilibrium exists; otherwise, it spends a high proportion of time in a state that maximizes a trade-off between the sum of the players' utilities and a predefined stability function. However, estimating the mean fraction of time spent in the optimum state, as well as the mean time needed to reach it, is challenging because of the high complexity and dimension of the underlying Markov chains. In this article, under a specific system model, these performance metrics are evaluated by means of an approximation of the Markov chains considered, which overcomes the problem of high dimensionality. The two algorithms are then compared, giving a better understanding of their performance.
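The two performance metrics named in the abstract (the fraction of time spent in the optimum state, and the mean time to reach it) can be illustrated on a toy Markov chain. The sketch below is not the approximation proposed in the article and does not model TEL or ODL. The 3-state transition matrix P, the choice of state 2 as the "optimum" state, and the function names are all assumptions made purely for illustration. For an ergodic chain, the long-run fraction of time in a state equals its stationary probability, and the mean hitting times satisfy a linear system that is solved here by fixed-point iteration.

```python
# Toy illustration only: the matrix P and the "optimum" state are assumed,
# not taken from the article.

def stationary_distribution(P, tol=1e-12, max_iter=100_000):
    """Power iteration for pi = pi P on a row-stochastic matrix P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


def mean_hitting_times(P, target, tol=1e-12, max_iter=100_000):
    """Expected number of steps to first reach `target` from each state.

    Iterates h_i = 1 + sum_{j != target} P[i][j] * h_j, with h_target = 0.
    """
    n = len(P)
    h = [0.0] * n
    for _ in range(max_iter):
        nxt = [
            0.0 if i == target
            else 1.0 + sum(P[i][j] * h[j] for j in range(n) if j != target)
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(nxt, h)) < tol:
            return nxt
        h = nxt
    return h


# Hypothetical chain: state 2 plays the role of the optimum state.
P = [
    [0.50, 0.40, 0.10],
    [0.30, 0.50, 0.20],
    [0.05, 0.05, 0.90],
]
OPTIMUM = 2

pi = stationary_distribution(P)
h = mean_hitting_times(P, OPTIMUM)

print("fraction of time in optimum state:", pi[OPTIMUM])
print("mean time to reach it from state 0:", h[0])
```

Direct computation of this kind requires the full transition matrix, whose size grows quickly with the number of players and actions. That growth is the high-dimensionality problem the abstract refers to.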


Updated: 2020-06-01
