Randomized matrix games in a finite population: Effect of stochastic fluctuations in the payoffs on the evolution of cooperation
Introduction
Cooperative behavior is widely observed in nature. However, natural selection tends to enhance selfish behavior through fierce competition. In order to explain the rationality of cooperation and its evolution in natural populations, a two-player game known as the Prisoner’s Dilemma (PD) has been widely studied as one of the most important theoretical frameworks (Axelrod and Hamilton, 1981, Maynard Smith, 1982, Axelrod, 1984, Poundstone, 1992, Nowak and Highfield, 2011). In an additive version of the PD game, cooperation takes the form of a donor who pays a cost c for a recipient to get a benefit b. Defection costs nothing and does not disqualify an individual from receiving a benefit. Therefore, whatever the opponent does, the payoff for cooperation never exceeds the payoff for defection (Nowak, 2006, Nowak and Sigmund, 2007). This is also the case in more general versions of the PD game. Moreover, assuming random pairwise interactions in an infinite population and average payoffs as relative growth rates, the replicator equation (Taylor and Jonker, 1978) predicts global convergence to fixation of defection (Hofbauer and Sigmund, 1998).
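The replicator prediction can be checked numerically. The following is a minimal sketch, assuming the additive PD described above with b written for the benefit and c for the cost (labels and numerical values chosen here for illustration). A cooperator earns b − c against a cooperator and −c against a defector, while a defector earns b and 0. If x is the frequency of cooperators, the average payoffs differ by exactly −c, so the replicator equation reduces to dx/dt = −c x(1 − x). The function name, the Euler scheme and the step size are assumptions of this sketch, not part of the paper.

```python
def replicator_trajectory(x0, b, c, dt=0.01, steps=5000):
    """Euler integration of dx/dt = x (1 - x) (pi_C - pi_D) for an additive PD.

    x is the frequency of cooperators; b and c are the (assumed) benefit and cost.
    """
    x = x0
    path = [x]
    for _ in range(steps):
        pi_c = x * (b - c) + (1 - x) * (-c)  # average payoff to cooperation
        pi_d = x * b                         # average payoff to defection
        x += dt * x * (1 - x) * (pi_c - pi_d)
        path.append(x)
    return path


# Start with 99% cooperators: the frequency still declines toward 0.
path = replicator_trajectory(x0=0.99, b=3.0, c=1.0)
print(path[0], path[-1])
```

The size of b plays no role in the trajectory here: only the cost c enters the payoff difference, which is why defection takes over from any interior starting point.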
In a finite population of constant size undergoing discrete, non-overlapping generations according to a Wright–Fisher model and more general models with exchangeable reproduction schemes (Fisher, 1930, Wright, 1931, Cannings, 1974, Ewens, 2004, Lessard, 2011), the fixation probability for a neutral mutant type represented only once initially is just the inverse of the population size, that is, 1/N for a population of size N. If this probability becomes larger than 1/N in the presence of selection, then the mutant type is said to be favored by selection (Nowak et al., 2004). Several mechanisms have been considered to explain how cooperation could be favored by natural selection assuming additive effects of average payoffs on fitness (Nowak and Sigmund, 2007). This is the case, for instance, for cooperation taking the form of the “Tit-for-Tat” strategy (Trivers, 1971, Axelrod and Hamilton, 1981, Axelrod, 1984), which starts with cooperation in a repeated PD game between randomly chosen partners, if the number of rounds exceeds some threshold value (Nowak et al., 2004). This is also the case in group-structured or graph-structured populations that model social or geographical networks with local interactions (Ohtsuki et al., 2006). However, with a one-round PD game and constant payoffs in a well-mixed population, the fitness of cooperation never exceeds the fitness of defection, and, as a result, cooperation cannot be favored by selection.
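The neutral benchmark can be illustrated by simulation. The sketch below assumes a haploid Wright–Fisher population in which each of the N offspring independently picks a parent with probability proportional to fitness. The function name, the parameter names and the values of N, runs and seed are hypothetical choices made for this example. With equal fitnesses, the estimated fixation probability of a single mutant should be close to the inverse of the population size.

```python
import random


def wright_fisher_fixation(N, fitness_mutant=1.0, fitness_resident=1.0,
                           runs=20000, seed=1):
    """Monte Carlo estimate of the fixation probability of a single mutant
    in a haploid Wright-Fisher population of constant size N."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(runs):
        k = 1  # one mutant initially
        while 0 < k < N:
            # Probability that a given offspring has a mutant parent.
            weight_mutant = k * fitness_mutant
            weight_resident = (N - k) * fitness_resident
            p = weight_mutant / (weight_mutant + weight_resident)
            # Binomial sampling of the next generation (N Bernoulli trials).
            k = sum(rng.random() < p for _ in range(N))
        fixed += (k == N)
    return fixed / runs


N = 10
estimate = wright_fisher_fixation(N)
print(estimate, 1.0 / N)
```

Comparing an estimate obtained with unequal fitnesses against the neutral value is the sense in which a mutant type is called favored by selection in the paragraph above.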
In nature, there are changes not only in the composition of a population but also in the surrounding environment in which the population finds itself. These can affect the payoffs that individuals receive as a result of interactions with others. Randomness in evolutionary games can take several forms, such as probabilistic encounter rules or mixed strategies that may or may not depend on the replies of others (Taylor and Jonker, 1978, Eshel and Cavalli-Sforza, 1982, Hofbauer and Sigmund, 1998). Of particular interest are stochastic games, which allow the environment to change in response to the players’ choices (Shapley, 1953, Fudenberg et al., 2012, Solan and Vieille, 2015, Hilbe et al., 2018). Variations in payoffs caused by disturbances in the natural environment should not be forgotten either. These can be periodic, e.g., seasonal or alternating between day and night. But they can also be totally random, as if occurring by accident (May, 1973, Kaplan et al., 1990, Lande et al., 2003). In the case of deterministically time-dependent payoffs in matrix games, for instance, Broom (2005) compares the time average of the population state and the interior Nash equilibrium of the average payoff matrix and shows that they can be arbitrarily far apart. With periodic payoffs, even stable periodic orbits can be found from arbitrary starting points (Uyttendaele et al., 2012). On the other hand, Stollmeier and Nagler (2018) show that, under the effects of random environmental noise, an evolutionary game with two strategies, one of which has a higher expected payoff than the other at every frequency, can reach a stationary distribution in which both strategies coexist.
In a matrix game, unless stochastic fluctuations in the environment are small enough to be ignored, it is more accurate to use random payoffs than constant payoffs. In particular, the introduction of random payoffs extends the classical PD game to a randomized PD game. In order to reveal how environmental noise can generally affect the evolutionary game dynamics in an infinite population, the concepts of stochastic evolutionary stability (SES) and stochastic convergence stability (SCS) have been investigated (Zheng et al., 2017, Zheng et al., 2018). Applying these concepts to a one-round randomized PD game in a well-mixed population, it can be shown that the evolution of cooperation tends to be more easily favored by natural selection if the coefficients of variation of the payoffs are smaller for cooperation than for defection (Li et al., 2019).
On the other hand, in a population genetics framework for a large finite population, Karlin and Levikson (1974) have shown that, when the mean and variance of frequency-independent genotypic fitnesses are of the same order given by the inverse of the population size, the effect of the variance matters. Actually, variability in selection, meaning fluctuating selection intensities, produces a “drift effect” away from the fixation states.
In order to study the effect of stochastic fluctuations in the context of an evolutionary game in a large finite population, we consider in this paper a matrix game with random payoffs for two players using one of two strategies. After ascertaining a diffusion approximation for this model, we focus on the Randomized Prisoner’s Dilemma (RPD) with cooperation and defection as strategies, and we consider the probability of ultimate fixation of either strategy as a single mutant. Conditions that favor the evolution of cooperation are examined in detail in the case of independent payoffs such that the average effects of cooperation and defection are additive. An RPD game with random additive effects of cooperation and defection on the payoffs as well as a repeated RPD game are also studied.
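A rough simulation can make the setting concrete. The sketch below is an illustration under stated assumptions, not the model or the diffusion approximation of the paper. It assumes a Wright–Fisher population of size N, fresh independent payoffs in every generation, fitness equal to 1 plus the average payoff against a randomly chosen individual, and a two-point payoff distribution (mean plus or minus one standard deviation) chosen only to keep fitness positive. The function name, the dictionary keys 'CC', 'CD', 'DC', 'DD' and all numerical values are hypothetical.

```python
import random


def rpd_fixation(N, means, sds, runs=5000, seed=1):
    """Monte Carlo estimate of the fixation probability of a single cooperator.

    means and sds map 'CC', 'CD', 'DC', 'DD' to the mean and standard deviation
    of the payoff to the first strategy against the second (assumed layout).
    """
    rng = random.Random(seed)
    fixed = 0
    for _ in range(runs):
        k = 1  # number of cooperators
        while 0 < k < N:
            x = k / N
            # Draw this generation's payoffs: mean +/- sd with equal probability.
            eta = {key: means[key] + sds[key] * rng.choice((-1.0, 1.0))
                   for key in means}
            f_c = 1.0 + x * eta['CC'] + (1 - x) * eta['CD']  # fitness of C
            f_d = 1.0 + x * eta['DC'] + (1 - x) * eta['DD']  # fitness of D
            p = x * f_c / (x * f_c + (1 - x) * f_d)
            k = sum(rng.random() < p for _ in range(N))
        fixed += (k == N)
    return fixed / runs


N = 20
b, c = 2.0 / N, 1.0 / N  # assumed benefit and cost, of order 1/N
means = {'CC': b - c, 'CD': -c, 'DC': b, 'DD': 0.0}
sds = {'CC': 0.0, 'CD': 0.0, 'DC': 0.3, 'DD': 0.3}  # noise on defection only
estimate = rpd_fixation(N, means, sds)
print(estimate, 1.0 / N)
```

Rerunning the sketch with different entries in sds and comparing each estimate with 1/N gives a numerical feel for how payoff variances, and not only payoff means, can enter the fixation probability. Setting all means and standard deviations to zero recovers the neutral case.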
Section snippets
The model
We consider a randomized matrix game with two strategies in a finite population of fixed finite size . The two possible pure strategies used by the individuals in the population are denoted by and . At time corresponding to some generation, the frequencies of and are given by and , respectively, while their payoffs in pairwise interactions are given by the entries of the 2 × 2 random game matrix Here, and are the payoffs to strategy
Diffusion approximation
Let be the change in the frequency of individuals that use strategy from time to time . Given , the first, second and fourth moments of can be calculated as (see Appendix A for details) and where and
Randomized Prisoner’s Dilemma (RPD)
Consider a random game matrix (1) with independent payoffs whose expected values determine a classical Prisoner’s Dilemma (PD). In this case, the population-scaled parameters in (3) verify for with , since the payoffs are uncorrelated, and with and , which defines a PD game. Then, we have a randomized Prisoner’s Dilemma (RPD) with strategies and corresponding to cooperation () and defection (), respectively.
Suppose that cooperation is
RPD with independent payoffs
In this section, we focus on an RPD game with independent payoffs whose expected values are such that This payoff matrix determines an additive PD game in which cooperation (C) incurs a fixed cost to the individual adopting it, but provides a fixed benefit to the opponent, while defection (D) incurs no cost at all.
In this case, the function in (34) is given by . Moreover, if , then it can be shown (see Appendix B for details) that the function in (27)
RPD with additive payoffs
In this section, we consider an RPD game with additive payoffs. At time , cooperation (C) incurs a random cost to the individual adopting it, but provides a random benefit to the opponent, while defection (D) incurs no cost at all, so that the random payoff matrix takes the form The main difference with the model in the previous section is that the payoffs are not independent.
Here, and are assumed to be random variables with
Repeated RPD
We now turn our attention to an RPD game that is repeated a random number of times. There are two pure actions, cooperation (C) and defection (D), and the payoffs in a single round of interaction between two players at time are given by the random game matrix Here, and are the payoffs to action against and , respectively, while and are the corresponding payoffs to action against the same two actions. These payoffs are assumed to be independent
Discussion
Environmental noise in the payoffs of a matrix game may have important effects on the evolutionary dynamics, and even change the outcome of evolution. As a matter of fact, the dynamics is driven not only by the expected values of the payoffs but also by their variances. Variability in payoffs can push the time average of a population state far from its interior Nash equilibrium (Broom, 2005) or even change the stability of a fixation state (Stollmeier and Nagler, 2018). In the case of a
Acknowledgments
This research was supported in part by NSERC of Canada (Grant no. 8833) and Chinese Academy of Sciences President’s International Fellowship Initiative (Grant no. 2016VBA039). We thank two anonymous referees for helpful comments to improve this paper.
References (35)
Evolutionary games with variable payoffs. C. R. Biol. (2005)
Temporal fluctuations in selection intensities: case of small population size. Theor. Popul. Biol. (1974)
Long-term stability from fixation probabilities in finite populations: New perspectives for ESS theory. Theor. Popul. Biol. (2005)
The Evolution of Cooperation (1984)
The evolution of cooperation. Science (1981)
The latent roots of certain Markov chains arising in genetics: A new approach, I. Haploid models. Adv. Appl. Probab. (1974)
Assortment of encounters and evolution of cooperativeness. Proc. Natl. Acad. Sci. USA (1982)
Mathematical Population Genetics: I Theoretical Introduction (2004)
The Genetical Theory of Natural Selection (1930)
Slow to anger and fast to forgive: Cooperation in an uncertain world. Amer. Econ. Rev. (2012)
Evolution of cooperation in stochastic games. Nature
The Theory of Evolution and Dynamical Systems
Diffusion models in population genetics. J. Appl. Probab.
Stochastic Population Dynamics in Ecology and Conservation
Uncertainty in payoffs for defection could be conducive to the evolution of cooperative behavior
Cited by (11)
The emergence of cooperative behavior based on random payoff and heterogeneity of concerning social image
2024, Chaos, Solitons and Fractals
Evolution of cooperation with respect to fixation probabilities in multi-player games with random payoffs
2022, Theoretical Population Biology
Citation excerpt: A stochastic version of the continuous-time replicator equation with a random noise added to the growth rate of every strategy was considered in Fudenberg and Harris (1992). Recently, the effect of stochastic changes in payoffs in discrete time was studied with particular attention to stochastic local stability of fixation states and constant polymorphic equilibria in an infinite population (Zheng et al., 2017, 2018), the fixation probability in a large population that reproduces according to a Wright–Fisher model (Li and Lessard, 2020) and the average abundance of strategies in a finite population that reproduces according to a Moran model (Kroumi and Lessard, 2021a,b). In general, variability in cost and benefit in social dilemmas introduces variances and covariances in payoffs for cooperation and defection.
Inclusive fitness and Hamilton's rule in a stochastic environment
2021, Theoretical Population Biology
Citation excerpt: See also Taylor et al. (2007), Lessard (2009) for further results on inclusive fitness in finite structured populations. As for environmental stochasticity, refer to Gillespie (1973), Karlin and Levikson (1974), Karlin and Liberman (1974) for early contributions on the effect of between-generation variance in selection parameters, and McNamara (1995), Zheng et al. (2017), Li and Lessard (2020) for more recent studies in the context of evolutionary game theory. This paper deals with implications of these studies on inclusive fitness theory and Hamilton’s rule.
The effect of variability in payoffs on average abundance in two-player linear games under symmetric mutation
2021, Journal of Theoretical Biology
Citation excerpt: We have shown that the average abundance of a strategy in the stationary state is driven not only by the means of the payoffs but also by their variances and covariances. These results agree with the fact that conditions for selection to favor the evolution of cooperation, favor more the evolution of cooperation than the evolution of defection, and disfavor the evolution of defection, all with respect to fixation probability in the absence of mutation, are lessened with an increase in the variances of the payoffs for defection and a decrease in the variances of the payoffs for cooperation (Li and Lessard, 2020). They agree also with the fact that the evolution of cooperation tends to be favored by selection in a large population with respect to the concepts of stochastic local stability (SLS) and stochastic evolutionary stability (SES) applied to Prisoner’s Dilemmas with random payoffs (Zheng et al., 2017; Zheng et al., 2018) if the coefficients of variation of the payoffs are smaller for cooperation than for defection (Li et al., 2020).
The effect of the opting-out strategy on conditions for selection to favor the evolution of cooperation in a finite population
2021, Journal of Theoretical Biology