Evolving Cooperation for the Iterated Prisoner's Dilemma
IEEE Transactions on Games (IF 2.3), Pub Date: 2020-01-01, DOI: 10.1109/tg.2020.3005124
Jessie Finocchiaro, H. David Mathias

The Iterated Prisoner’s Dilemma (IPD) has been studied in fields as diverse as economics, computer science, psychology, politics, and environmental studies. This is due, in part, to the intriguing property that its Nash equilibrium is not globally optimal. The IPD is typically treated as a single-objective problem in which a player’s goal is to maximize their own score. In some work, minimizing the opponent’s score is an additional objective. Here, we explore the role of explicitly optimizing for mutual cooperation in IPD player performance. We implement a genetic algorithm in which each member of the population evolves using one of four multi-objective fitness functions: selfish, communal, cooperative, and selfless, the last three of which use a cooperative metric as an objective. As a control, we also consider two single-objective fitness functions. We explore the role of representation in evolving cooperation by implementing four representations for evolving players. Finally, we evaluate the effect of noise on the evolution of cooperative behaviors. Testing our evolved players in tournaments in which a player’s own score is the sole metric, we find that players evolved with mutual cooperation as an objective are very competitive. Thus, learning to play nicely with others is a successful strategy for maximizing personal reward.
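To make the quantities in the abstract concrete, the following is a minimal Python sketch of a single IPD match that reports three candidate objectives: a player's own score, the opponent's score, and the fraction of rounds with mutual cooperation. It is an illustration only, not the authors' implementation. The payoff values (5, 3, 1, 0), the noise model (each move flipped independently with probability `noise`), and the names `play_match`, `tit_for_tat`, and `always_defect` are assumptions that do not come from the abstract, and the paper's four fitness functions are not reproduced here.

```python
import random

# Conventional payoff values, assumed here; the abstract does not state them.
# Keys are (move_a, move_b); values are (payoff_a, payoff_b).
PAYOFFS = {
    ("C", "C"): (3, 3),  # mutual cooperation
    ("C", "D"): (0, 5),  # A is exploited
    ("D", "C"): (5, 0),  # A exploits B
    ("D", "D"): (1, 1),  # mutual defection (the one-shot Nash equilibrium)
}


def tit_for_tat(own_history, opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opp_history[-1] if opp_history else "C"


def always_defect(own_history, opp_history):
    """Defect on every round."""
    return "D"


def flip(move):
    """Return the opposite move."""
    return "D" if move == "C" else "C"


def play_match(player_a, player_b, rounds=100, noise=0.0, rng=None):
    """Play one match; return (score_a, score_b, mutual_cooperation_rate).

    Noise model (an assumption): each chosen move is flipped
    independently with probability `noise` before payoffs are applied.
    """
    if rng is None:
        rng = random.Random(0)
    hist_a, hist_b = [], []
    score_a = score_b = mutual_coop = 0
    for _ in range(rounds):
        move_a = player_a(hist_a, hist_b)
        move_b = player_b(hist_b, hist_a)
        if rng.random() < noise:
            move_a = flip(move_a)
        if rng.random() < noise:
            move_b = flip(move_b)
        pay_a, pay_b = PAYOFFS[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        if move_a == "C" and move_b == "C":
            mutual_coop += 1
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b, mutual_coop / rounds


if __name__ == "__main__":
    print(play_match(tit_for_tat, tit_for_tat, rounds=10))
    print(play_match(always_defect, always_defect, rounds=10))
    print(play_match(tit_for_tat, always_defect, rounds=10, noise=0.05))
```

Under these assumed payoffs, two always-defecting players earn 1 point each per round while two mutually cooperating players earn 3 each, which illustrates why the Nash equilibrium is not globally optimal. A multi-objective fitness function could combine the three returned values in different ways; how the paper does so is not specified in the abstract.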

Updated: 2020-01-01