Randomized matrix games in a finite population: Effect of stochastic fluctuations in the payoffs on the evolution of cooperation

https://doi.org/10.1016/j.tpb.2020.04.006

Abstract

A diffusion approximation for a randomized 2 × 2-matrix game in a large finite population is ascertained in the case of random payoffs whose expected values, variances and covariances are of order given by the inverse of the population size N. Applying the approximation to a Randomized Prisoner’s Dilemma (RPD) with independent payoffs for cooperation and defection in random pairwise interactions, conditions on the variances of the payoffs for selection to favor the evolution of cooperation, favor more the evolution of cooperation than the evolution of defection, and disfavor the evolution of defection are deduced. All these are obtained from probabilities of ultimate fixation of a single mutant. It is shown that the conditions are lessened with an increase in the variances of the payoffs for defection against cooperation and defection and a decrease in the variances of the payoffs for cooperation against cooperation and defection. A RPD game with independent payoffs whose expected values are additive is studied in detail to support the conclusions. Randomized matrix games with non-independent payoffs, namely the RPD game with additive payoffs for cooperation and defection based on random cost and benefit for cooperation and the repeated RPD game with Tit-for-Tat and Always-Defect as strategies in pairwise interactions with a random number of rounds, are studied under the assumption that the population-scaled expected values, variances and covariances of the payoffs are all of the same small enough order. In the first model, the conditions in favor of the evolution of cooperation hold only if the covariance between the cost and the benefit is large enough, while the analysis of the second model extends the results on the effects of the variances of the payoffs for cooperation and defection found for the one-round RPD game.

Introduction

Cooperative behavior is a phenomenon that is widely observed in nature. However, natural selection tends to enhance selfish behavior through fierce competition. In order to explain the rationality of cooperation and its evolution in natural populations, a two-player game known as the Prisoner’s Dilemma (PD) has been widely studied as one of the most important theoretical frameworks (Axelrod and Hamilton, 1981, Maynard Smith, 1982, Axelrod, 1984, Poundstone, 1992, Nowak and Highfield, 2011). In an additive version of the PD game, cooperation takes the form of a donor who pays a cost c for a recipient to get a benefit b. Defection costs nothing and does not disqualify from receiving a benefit. Therefore, the payoff for cooperation never exceeds the payoff for defection (Nowak, 2006, Nowak and Sigmund, 2007). This is the case in more general versions of the PD game. Moreover, assuming random pairwise interactions in an infinite population and average payoffs as relative growth rates, the replicator equation (Taylor and Jonker, 1978) predicts global convergence to fixation of defection (Hofbauer and Sigmund, 1998).
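
As a brief worked illustration of the last claim, here is a sketch for the additive PD game. It assumes the standard two-strategy replicator form and writes $x$ for the frequency of cooperators and $f_C$, $f_D$ for the average payoffs; this notation is introduced here for illustration only.

```latex
\begin{align*}
f_C(x) &= x(b-c) + (1-x)(-c) = bx - c,\\
f_D(x) &= xb + (1-x)\cdot 0 = bx,\\
\dot{x} &= x(1-x)\bigl(f_C(x) - f_D(x)\bigr) = -c\,x(1-x) < 0 \quad \text{for } 0 < x < 1 .
\end{align*}
```

Since $c > 0$, the frequency of cooperation decreases from any interior starting point, which is the global convergence to fixation of defection mentioned above.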

In a finite population of constant size N undergoing discrete, non-overlapping generations according to a Wright–Fisher model and more general models with exchangeable reproduction schemes (Fisher, 1930, Wright, 1931, Cannings, 1974, Ewens, 2004, Lessard, 2011), the fixation probability for a neutral mutant type represented only once initially is just the inverse of the population size, that is, $N^{-1}$. If this probability becomes larger than $N^{-1}$ in the presence of selection, then the mutant type has been said to be favored by selection (Nowak et al., 2004). Several mechanisms have been considered to explain how cooperation could be favored by natural selection assuming additive effects of average payoffs on fitness (Nowak and Sigmund, 2007). This is the case, for instance, for cooperation taking the form of the “Tit-for-Tat” strategy (Trivers, 1971, Axelrod and Hamilton, 1981, Axelrod, 1984) starting with cooperation in a repeated PD game between randomly chosen partners if the number of rounds exceeds some threshold value (Nowak et al., 2004). This is also the case in group-structured or graph-structured populations for modeling some social or geographical networks with local interactions (Ohtsuki et al., 2006). However, with a one-round PD game and constant payoffs in a well-mixed population, the fitness of cooperation never exceeds the fitness of defection, and, as a result, cooperation cannot be favored by selection.
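
The neutral benchmark of one over the population size can be checked numerically. The following is a minimal sketch, assuming a haploid Wright–Fisher population with binomial resampling and no selection; the function name, the seed and the parameter values are hypothetical choices made for illustration, not taken from the paper.

```python
import random


def neutral_fixation_probability(pop_size, trials, seed=0):
    """Estimate the fixation probability of a single neutral mutant
    under haploid Wright-Fisher resampling (no selection)."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(trials):
        mutants = 1
        # Run until the mutant type is lost (0) or fixed (pop_size).
        while 0 < mutants < pop_size:
            freq = mutants / pop_size
            # Each of the N offspring independently picks a mutant parent
            # with probability equal to the current mutant frequency.
            mutants = sum(rng.random() < freq for _ in range(pop_size))
        fixed += mutants == pop_size
    return fixed / trials


estimate = neutral_fixation_probability(pop_size=10, trials=10000)
print("estimate:", estimate, "neutral benchmark 1/N:", 1 / 10)
```

A mutant type whose estimated fixation probability lies clearly above this benchmark would count as favored by selection in the sense described above.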

In nature, there are changes not only in the composition of a population but also in the surrounding environment in which the population finds itself. These can affect the payoffs that individuals receive as a result of interactions with others. Randomness in evolutionary games can take several forms such as probabilistic encounter rules or mixed strategies depending or not on the replies of others (Taylor and Jonker, 1978, Eshel and Cavalli-Sforza, 1982, Hofbauer and Sigmund, 1998). Of particular interest are stochastic games which allow the environment to change in response to the players’ choices (Shapley, 1953, Fudenberg et al., 2012, Solan and Vieille, 2015, Hilbe et al., 2018). But also not to be forgotten are variations in payoffs caused by disturbances in the natural environment. These can be periodic, e.g., being seasonal or alternating day and night. But they can also be totally random as if occurring by accident (May, 1973, Kaplan et al., 1990, Lande et al., 2003). In the case of deterministically time-dependent payoffs in 2×2 matrix games, for instance, Broom (2005) compares the time average of the population state and the interior Nash equilibrium of the average payoff matrix and shows that they can be arbitrarily far apart. With periodic payoffs, even stable periodic orbits can be found from arbitrary starting points (Uyttendaele et al., 2012). On the other hand, it is shown in Stollmeier and Nagler (2018) that under the effects of random environmental noise, an evolutionary game involving two strategies with a strategy having a higher expected payoff at any frequency than the other can reach a stationary distribution with both strategies co-existing.

In a matrix game, unless stochastic fluctuations in the environment are small enough to be ignored, it is more accurate to use random payoffs than constant payoffs. In particular, the introduction of random payoffs extends the classical PD game to a randomized PD game. In order to reveal how environmental noise can generally affect the evolutionary game dynamics in an infinite population, the concepts of stochastic evolutionary stability (SES) and stochastic convergence stability (SCS) have been investigated (Zheng et al., 2017, Zheng et al., 2018). Applying these concepts to a one-round randomized PD game in a well-mixed population, it can be shown that the evolution of cooperation tends to be more easily favored by natural selection if the coefficients of variation of the payoffs are smaller for cooperation than for defection (Li et al., 2019).

On the other hand, in a population genetics framework for a large finite population, Karlin and Levikson (1974) have shown that, when the mean and variance of frequency-independent genotypic fitnesses are of the same order given by the inverse of the population size, the effect of the variance matters. Actually, variability in selection, meaning fluctuating selection intensities, produces a “drift effect” away from the fixation states.

In order to study the effect of stochastic fluctuations in a context of an evolutionary game in a large finite population, we consider in this paper a matrix game with random payoffs for two players using one of two strategies. After ascertaining a diffusion approximation for this model, we focus on the Randomized Prisoner’s Dilemma (RPD) with cooperation and defection as strategies, and we consider the probability of ultimate fixation of either strategy as a single mutant. Conditions that favor the evolution of cooperation are examined in detail in the case of independent payoffs such that the average effects of cooperation and defection are additive. A RPD game with random additive effects of cooperation and defection on the payoffs as well as a repeated RPD game are also studied.

Section snippets

The model

We consider a randomized matrix game with two strategies in a finite population of fixed size $N$. The two possible pure strategies used by the individuals in the population are denoted by $S_1$ and $S_2$. At time $t \ge 0$ corresponding to some generation, the frequencies of $S_1$ and $S_2$ are given by $x(t)$ and $1 - x(t)$, respectively, while their payoffs in pairwise interactions are given by the entries of the 2 × 2 random game matrix
$$\begin{pmatrix} \eta_1(t) & \eta_2(t) \\ \eta_3(t) & \eta_4(t) \end{pmatrix}.$$
Here, $\eta_1(t)$ and $\eta_2(t)$ are the payoffs to strategy $S_1$

Diffusion approximation

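Let $\Delta x = x(t+\Delta t) - x(t)$ be the change in the frequency of individuals that use strategy $S_1$ from time $t$ to time $t+\Delta t$. Given $x(t)=x$, the first, second and fourth moments of $\Delta x$ can be calculated as (see Appendix A for details)
$$E(\Delta x \mid x(t)=x) = m(x)\,\Delta t + o(\Delta t),$$
$$E\big((\Delta x)^2 \mid x(t)=x\big) = v(x)\,\Delta t + o(\Delta t)$$
and
$$E\big((\Delta x)^4 \mid x(t)=x\big) = o(\Delta t),$$
where
$$m(x) = x(1-x)\Big(\mu_2-\mu_4 + x(\mu_1-\mu_2-\mu_3+\mu_4) + x^3(\sigma_{13}-\sigma_1^2) + x(1-x)^2(2\sigma_{34}-\sigma_{14}-\sigma_{23}+\sigma_{24}-\sigma_2^2) + x^2(1-x)(-2\sigma_{12}+\sigma_{14}+\sigma_{23}-\sigma_{13}+\sigma_3^2) + (1-x)^3(\sigma_4^2-\sigma_{24})\Big)$$
and $v(x) = x(1-x)(1 + x^3(1-x)(\sigma_1^2+\sigma_3^2-2\sigma_{13}) + x(1-x)^3(\sigma_2^2+\sigma_4^2-2\sigma_{24}) + 2x^2(1-x$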
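
The drift coefficient m(x) can be evaluated numerically, as in the sketch below. The function name and the layout of the arguments are hypothetical. The signs of the terms are an assumption: they follow a second-order expansion of the expected frequency change and should be checked against Appendix A of the paper. The coefficient v(x) is not coded because its expression is cut off in the snippet above.

```python
def drift(x, mu, var, cov=None):
    """Drift coefficient m(x) of the diffusion approximation (sketch).

    mu  : (mu1, mu2, mu3, mu4), population-scaled means of the payoffs
    var : (s1, s2, s3, s4), population-scaled variances sigma_i^2
    cov : dict {(i, j): sigma_ij} with i < j; missing pairs count as 0
    """
    cov = cov or {}
    mu1, mu2, mu3, mu4 = mu
    s1, s2, s3, s4 = var

    def c(i, j):
        # Covariance sigma_ij, zero when the pair is not supplied.
        return cov.get((i, j), 0.0)

    y = 1.0 - x
    inner = (
        mu2 - mu4 + x * (mu1 - mu2 - mu3 + mu4)
        + x**3 * (c(1, 3) - s1)
        + x * y**2 * (2 * c(3, 4) - c(1, 4) - c(2, 3) + c(2, 4) - s2)
        + x**2 * y * (-2 * c(1, 2) + c(1, 4) + c(2, 3) - c(1, 3) + s3)
        + y**3 * (s4 - c(2, 4))
    )
    return x * y * inner


# Additive PD means with b = 3, c = 1 and no payoff variability.
print(drift(0.3, mu=(2.0, -1.0, 3.0, 0.0), var=(0.0, 0.0, 0.0, 0.0)))
```

With independent payoffs (no covariances), the variances of the payoffs to S2 (s3, s4) enter with a positive sign and those of the payoffs to S1 (s1, s2) with a negative sign. This agrees with the abstract's statement that larger variances for defection and smaller variances for cooperation ease the conditions for the evolution of cooperation.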

Randomized Prisoner’s Dilemma (RPD)

Consider a random game matrix (1) with independent payoffs whose expected values determine a classical Prisoner’s Dilemma (PD). In this case, the population-scaled parameters in (3) satisfy $\sigma_{ij}=0$ for $i,j=1,\ldots,4$ with $i \ne j$, since the payoffs are uncorrelated, and
$$\begin{pmatrix} \mu_1 & \mu_2 \\ \mu_3 & \mu_4 \end{pmatrix} = \begin{pmatrix} R & S \\ T & P \end{pmatrix}$$
with $T>R>P>S$ and $2R>T+S$, which defines a PD game. Then, we have a randomized Prisoner’s Dilemma (RPD) with strategies $S_1$ and $S_2$ corresponding to cooperation (C) and defection (D), respectively.

Suppose that cooperation is

RPD with independent payoffs

In this section, we focus on a RPD game with independent payoffs whose expected values are such that
$$\begin{pmatrix} \mu_1 & \mu_2 \\ \mu_3 & \mu_4 \end{pmatrix} = \begin{pmatrix} b-c & -c \\ b & 0 \end{pmatrix}.$$
This payoff matrix determines an additive PD game in which cooperation (C) incurs a fixed cost $c>0$ to the individual adopting it, but provides a fixed benefit $b>0$ to the opponent, while defection (D) incurs no cost at all.
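
As a quick check, the expected payoffs of the additive game can be tested against the PD inequalities $T>R>P>S$ and $2R>T+S$, with $R=b-c$, $S=-c$, $T=b$ and $P=0$. The sketch assumes in addition that $b>c$; this is not stated in the sentence above but is needed for $R>P$.

```latex
\begin{align*}
T - R &= c > 0, & R - P &= b - c > 0,\\
P - S &= c > 0, & 2R - (T + S) &= 2(b-c) - (b-c) = b - c > 0 .
\end{align*}
```

All four differences are positive exactly when $b > c > 0$.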

In this case, the function h(x) in (34) is given by h(x)=c. Moreover, if c1, then it can be shown (see Appendix B for details) that the function g(x) in (27)

RPD with additive payoffs

In this section, we consider a RPD game with additive payoffs. At time $t \ge 0$, cooperation (C) incurs a random cost $c(t)>0$ to the individual adopting it, but provides a random benefit $b(t)>0$ to the opponent, while defection (D) incurs no cost at all, so that the random payoff matrix takes the form
$$\begin{pmatrix} \eta_1(t) & \eta_2(t) \\ \eta_3(t) & \eta_4(t) \end{pmatrix} = \begin{pmatrix} b(t)-c(t) & -c(t) \\ b(t) & 0 \end{pmatrix}.$$
The main difference with the model in the previous section is that the payoffs are not independent.
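
To see why the payoffs are not independent, note that all four entries are linear in the same pair b(t), c(t), so their variances and covariances follow from the bilinearity of covariance. The sketch below computes these unscaled moments from Var(b), Var(c) and Cov(b, c). The function name and dictionary keys are hypothetical, and the population scaling used in the paper is not applied.

```python
def payoff_moments(var_b, var_c, cov_bc):
    """Variances and covariances of eta1 = b - c, eta2 = -c, eta3 = b, eta4 = 0,
    obtained from Var(b), Var(c) and Cov(b, c) by bilinearity."""
    return {
        "var1": var_b + var_c - 2.0 * cov_bc,  # Var(b - c)
        "var2": var_c,                         # Var(-c)
        "var3": var_b,                         # Var(b)
        "var4": 0.0,                           # eta4 is constant
        "cov12": var_c - cov_bc,               # Cov(b - c, -c)
        "cov13": var_b - cov_bc,               # Cov(b - c, b)
        "cov23": -cov_bc,                      # Cov(-c, b)
        "cov14": 0.0,                          # covariances with the
        "cov24": 0.0,                          # constant eta4 vanish
        "cov34": 0.0,
    }


moments = payoff_moments(var_b=2.0, var_c=1.0, cov_bc=0.5)
for name, value in moments.items():
    print(name, value)
```

Even when the cost and the benefit are uncorrelated, cov12 and cov13 are nonzero whenever c(t) and b(t), respectively, are not constant.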

Here, c(t) and b(t) are assumed to be random variables with E(b(

Repeated RPD

We now turn our attention to a RPD game that is repeated a random number of times. There are two pure actions, cooperation (C) and defection (D), and the payoffs in a single round of interaction between two players at time $t \ge 0$ are given by the random game matrix
$$\begin{pmatrix} R(t) & S(t) \\ T(t) & P(t) \end{pmatrix}.$$
Here, $R(t)$ and $S(t)$ are the payoffs to action C against C and D, respectively, while $T(t)$ and $P(t)$ are the corresponding payoffs to action D against the same two actions. These payoffs are assumed to be independent

Discussion

Environmental noise in the payoffs of a matrix game may have important effects on the evolutionary dynamics, and even change the outcome of evolution. As a matter of fact, the dynamics is driven not only by the expected values of the payoffs but also by their variances. Variability in payoffs can push the time average of a population state far from its interior Nash equilibrium (Broom, 2005) or even change the stability of a fixation state (Stollmeier and Nagler, 2018). In the case of a

Acknowledgments

This research was supported in part by NSERC of Canada (Grant no. 8833) and Chinese Academy of Sciences President’s International Fellowship Initiative (Grant no. 2016VBA039). We thank two anonymous referees for helpful comments to improve this paper.

References (35)

  • Hilbe, C., et al. Evolution of cooperation in stochastic games. Nature (2018)
  • Hofbauer, J., et al. The Theory of Evolution and Dynamical Systems (1998)
  • Kaplan, H., et al.
  • Kimura, M. Diffusion models in population genetics. J. Appl. Probab. (1964)
  • Lande, R., et al. Stochastic Population Dynamics in Ecology and Conservation (2003)
  • Lessard, S.
  • Li, C., et al. Uncertainty in payoffs for defection could be conducive to the evolution of cooperative behavior (2019)