
Understanding repeat playing behavior in casual games using a Bayesian data augmentation approach

Published in: Quantitative Marketing and Economics

Abstract

With an estimated market size of nearly $18 billion in 2016, casual games (games played over social networks or on mobile devices) have become increasingly popular. Because most casual games are free to install, understanding repeat playing behavior is important for game developers, as it directly drives advertising revenue. Game developers are keenly interested in benchmarking their games against the market average and in understanding how genre and various game mechanics drive repeat playing behavior. Such cross-sectional analysis, however, is difficult to conduct: individual-level data on competitors’ games are not publicly available, and the casual gaming industry is highly fragmented, with each firm making only a handful of games.

I develop a Bayesian approach, based on a parsimonious Hidden Markov Model at the individual level in conjunction with data augmentation, to study repeat playing behavior using only publicly available data. Applying the proposed approach to a sample of 379 casual games, I find that the average daily attrition rate across games is around 36.5%, with an average “play” rate of 47.9%, resulting in an average ARPU (average revenue per user) across games of around 20.5 cents. Certain genres are linked to higher attrition rates and play rates. In addition, giving out a “daily bonus” and limiting the amount of time that gamers can play each day are associated with a 17.7% and a 16.4% higher ARPU, respectively.


Notes

1. In the context of casual game playing behavior, the transition from “Death” back to “Active” is extremely unlikely. Private communication with data scientists at several major game developers reveals that once a user has not logged in for 30 consecutive days, his/her probability of returning at least once over the next 30 days is lower than 1%. Thus, the probability of moving from “Death” to “Active” is negligible.

2. Consider M = 100 million (as discussed in the introduction, the game CityVille has an MAU of 100 million, which sets a lower bound for its potential number of players M_i). Augmenting a year of play history for each player would require simulating 2 × 365 × 100 million = 7.3 × 10^10 entries. Even if each entry took only 1 byte of memory, the two latent matrices alone would require roughly 73 gigabytes to store and process in every single MCMC iteration, which is computationally impractical (a quick check of this arithmetic appears after these notes).

  3. According to Appdata (the data provider), there are some minor non-specific measurement errors in the DAU and MAU data. For instance, in several cases I notice that the DAU and MAU values on the release date of a game are not the same (when they should be equal by definition), which Appdata ascribes to server-side recording error. Such measurement errors are assumed to occur at random.

4. Robustness checks are conducted by setting σ_d and σ_m to 0.025 and 0.1 instead of the 0.05 used here; the key results are substantively unchanged.

  5. For instance, if individual-level adoption probabilities are negatively correlated with attrition probabilities (i.e., early adopters are less likely to churn), later cohorts would have a higher average attrition probability than earlier cohorts, thus introducing model misspecification error since the model assumes that average attrition probability is constant across cohorts.

  6. I have also conducted another set of simulation studies where M is set to be 1,000,000. The results are substantially unchanged and are available upon request.

  7. See, e.g., https://twitter.com/sequoia/status/436302641992187904

  8. Another possibility is to directly build these covariates into the model and estimate their effect jointly when estimating the model. However, this is computationally intractable because the games can no longer be estimated in parallel.
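As a quick check of the storage arithmetic in note 2, here is a minimal sketch (the variable names are mine; the 1-byte-per-entry assumption is the one stated in the note):

```python
# Back-of-envelope check of the storage requirement in note 2.
M = 100_000_000                 # potential players (CityVille MAU lower bound)
T = 365                         # one year of daily play history
entries = 2 * T * M             # latent state matrix S plus latent play matrix Y
gigabytes = entries * 1 / 1e9   # at 1 byte per entry
print(f"{entries:.1e} entries ≈ {gigabytes:.0f} GB per MCMC iteration")
# -> 7.3e+10 entries ≈ 73 GB per MCMC iteration
```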

References

Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013). Bayesian data analysis (3rd ed.). Chapman & Hall/CRC.

Author information


Corresponding author

Correspondence to Sam K. Hui.

Appendix

I. MCMC computational procedure

I describe the MCMC procedure used to calibrate the model. In each iteration, I draw from the full conditional distributions of model parameters in the following order:

  • \( \left({S}_{(i)},{Y}_{(i)}\right),\ {\pi}_{it},\ {\theta}_{ij},\ {\phi}_{ij},\ \left({a}_i^{(\pi)},{b}_i^{(\pi)}\right),\ \left({a}_i^{(\theta)},{b}_i^{(\theta)}\right),\ \left({a}_i^{(\phi)},{b}_i^{(\phi)}\right) \). An independent Metropolis-Hastings algorithm is used to sample each row of \( \left({S}_{(i)},{Y}_{(i)}\right) \); given \( \left({S}_{(i)},{Y}_{(i)}\right) \), the parameters \( {\pi}_{it} \), \( {\theta}_{ij} \), and \( {\phi}_{ij} \) are drawn using a Gibbs sampler; and the hyperparameters \( \left({a}_i^{(\pi)},{b}_i^{(\pi)}\right) \), \( \left({a}_i^{(\theta)},{b}_i^{(\theta)}\right) \), and \( \left({a}_i^{(\phi)},{b}_i^{(\phi)}\right) \) are drawn using a random-walk Metropolis-Hastings algorithm (Gelman et al. 2013). Each step is outlined below.

1) Drawing each row of \( \left({S}_{(i)},{Y}_{(i)}\right) \):

  • For each representative gamer j (i.e., the j-th row of \( \left({S}_{(i)},{Y}_{(i)}\right) \)), I simulate a “proposed play history” (comprising both the time series of her daily play and her state transitions) using the HMM specified in Eq. [1]–[5] with the current draws of θ_ij and ϕ_ij. I then compute the likelihood of the proposed play history given Eq. [8] and [9], and accept or reject the new draw based on the Metropolis-Hastings acceptance probability (Gelman et al. 2013), as sketched below.
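To make step 1) concrete, here is a minimal sketch of the independent Metropolis-Hastings update for a single row, assuming a stylized version of the three-state HMM (“Unaware” → “Active” → “Dead”); the function log_lik is a hypothetical placeholder for the observation likelihood in Eq. [8]–[9], and the exact timing conventions are those of the paper’s equations, not this sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def propose_history(T, pi, theta, phi):
    """Simulate latent states s_t (0 = Unaware, 1 = Active, 2 = Dead) and
    daily play indicators y_t from a stylized version of the HMM in
    Eq. [1]-[5], given the current adoption (pi_t), attrition (theta),
    and play (phi) probabilities."""
    s, y = np.zeros(T, dtype=int), np.zeros(T, dtype=int)
    state = 0
    for t in range(T):
        if state == 0 and rng.random() < pi[t]:
            state, y[t] = 1, 1          # adopts and plays on her first Active day
        elif state == 1:
            if rng.random() < theta:
                state = 2               # Active -> Dead (absorbing)
            else:
                y[t] = int(rng.random() < phi)  # plays an Active day w.p. phi
        s[t] = state
    return s, y

def mh_update_row(s_cur, y_cur, log_lik, T, pi, theta, phi):
    """Independent MH step: the proposal is drawn from the HMM itself, so the
    prior terms cancel and the acceptance ratio reduces to a likelihood ratio."""
    s_prop, y_prop = propose_history(T, pi, theta, phi)
    log_alpha = log_lik(s_prop, y_prop) - log_lik(s_cur, y_cur)
    if np.log(rng.random()) < log_alpha:
        return s_prop, y_prop           # accept the proposed play history
    return s_cur, y_cur                 # keep the current play history

# Illustrative usage with a flat (dummy) likelihood and T = 30 days:
T, pi = 30, np.full(30, 0.05)
s0, y0 = propose_history(T, pi, theta=0.3, phi=0.5)
s1, y1 = mh_update_row(s0, y0, lambda s, y: 0.0, T, pi, 0.3, 0.5)
```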

2) Drawing π_it:

  • Denote the number of gamers (in the representative sample of size R) who are in the “Unaware” state at the start of the t-th period by \( n_{it}^{(U)} \), and the number of transitions from the “Unaware” state to the “Active” state during the t-th period by \( n_{it}^{(U\to A)} \). Then, π_it can be sampled from a \( \mathrm{Beta}\left(a_i^{(\pi)}+n_{it}^{(U\to A)},\ b_i^{(\pi)}+n_{it}^{(U)}-n_{it}^{(U\to A)}\right) \) distribution. (Steps 2–4 are all conjugate Beta updates; a sketch follows step 4.)

3) Drawing θ_ij:

  • For gamer j, denote her total number of “Active” → “Active” transitions by \( n_{ij}^{(A\to A)} \), and her total number of “Active” → “Dead” transitions by \( n_{ij}^{(A\to D)} \) (by definition, \( n_{ij}^{(A\to D)} \) is either 0 or 1, since “Dead” is an absorbing state). Then, θ_ij can be sampled from a \( \mathrm{Beta}\left(a_i^{(\theta)}+n_{ij}^{(A\to D)},\ b_i^{(\theta)}+n_{ij}^{(A\to A)}\right) \) distribution.

4) Drawing ϕ_ij:

  • Denote the number of days that gamer j stays in the “Active” state (excluding the first day on which she becomes “Active”) by \( n_{ij}^{(A)} \), and the number of days on which she plays the game (again excluding the first day) by \( n_{ij}^{(P)} \). Then, ϕ_ij can be sampled from a \( \mathrm{Beta}\left(a_i^{(\phi)}+n_{ij}^{(P)},\ b_i^{(\phi)}+n_{ij}^{(A)}-n_{ij}^{(P)}\right) \) distribution.
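Steps 2)–4) are standard conjugate Beta updates. A minimal sketch, with made-up counts and hyperparameter values standing in for those tallied from the current augmented histories:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative counts from the current augmented histories (made up here).
n_U, n_U_to_A = 500, 12       # gamers Unaware at the start of period t; adoptions
n_A_to_A, n_A_to_D = 40, 1    # Active->Active days; Active->Dead transition (0 or 1)
n_A, n_P = 40, 19             # Active days (excl. first day); played days

# Illustrative hyperparameter values from the current MCMC draw.
a_pi, b_pi = 1.0, 20.0
a_theta, b_theta = 2.0, 3.0
a_phi, b_phi = 2.0, 2.0

# Step 2: pi_it ~ Beta(a + n_{U->A}, b + n_U - n_{U->A})
pi_it = rng.beta(a_pi + n_U_to_A, b_pi + n_U - n_U_to_A)

# Step 3: theta_ij ~ Beta(a + n_{A->D}, b + n_{A->A})
theta_ij = rng.beta(a_theta + n_A_to_D, b_theta + n_A_to_A)

# Step 4: phi_ij ~ Beta(a + n_P, b + n_A - n_P)
phi_ij = rng.beta(a_phi + n_P, b_phi + n_A - n_P)
```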

5) Drawing \( \left(a_i^{(\pi)},b_i^{(\pi)}\right) \), \( \left(a_i^{(\theta)},b_i^{(\theta)}\right) \), and \( \left(a_i^{(\phi)},b_i^{(\phi)}\right) \):

  • Because standard conjugate computations are not available for the hyperparameters \( \left(a_i^{(\pi)},b_i^{(\pi)}\right) \), \( \left(a_i^{(\theta)},b_i^{(\theta)}\right) \), and \( \left(a_i^{(\phi)},b_i^{(\phi)}\right) \), I use a random-walk Metropolis-Hastings algorithm to sample from their posterior distributions. To sample \( \left(a_i^{(\pi)},b_i^{(\pi)}\right) \), I use a bivariate, independent Gaussian random-walk proposal centered on the previous draw; negative draws are “reflected” off the origin. The variance of the proposal distribution is tuned to achieve an acceptance rate close to 50% (Gelman et al. 2013). The hyperparameters \( \left(a_i^{(\theta)},b_i^{(\theta)}\right) \) and \( \left(a_i^{(\phi)},b_i^{(\phi)}\right) \) are sampled analogously, as sketched below.
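A minimal sketch of the random-walk Metropolis-Hastings update in step 5), shown for one hyperparameter pair; the flat prior on (a, b) and the step size are assumptions of this sketch, and in practice the proposal variance is tuned toward the ~50% acceptance rate mentioned above:

```python
import numpy as np
from scipy.stats import beta as beta_dist

rng = np.random.default_rng(0)

def log_posterior(a, b, draws):
    """Log posterior of the Beta hyperparameters (a, b) given the current
    individual-level draws (e.g., the theta_ij's), under a flat prior on
    (a, b) -- an assumption made for this sketch."""
    return np.sum(beta_dist.logpdf(draws, a, b))

def rw_mh_step(a, b, draws, step_sd=0.1):
    # Bivariate independent Gaussian random walk centered on the previous draw.
    a_prop = a + step_sd * rng.normal()
    b_prop = b + step_sd * rng.normal()
    # "Reflect" negative proposals off the origin; reflection keeps the
    # proposal symmetric, so the acceptance ratio is unchanged.
    a_prop, b_prop = abs(a_prop), abs(b_prop)
    log_alpha = log_posterior(a_prop, b_prop, draws) - log_posterior(a, b, draws)
    if np.log(rng.random()) < log_alpha:
        return a_prop, b_prop   # accept
    return a, b                 # reject: keep the previous draw

# Illustrative usage: update (a, b) given 200 stand-in attrition draws.
theta_draws = rng.beta(2.0, 5.0, size=200)
a, b = 1.5, 4.0
for _ in range(100):
    a, b = rw_mh_step(a, b, theta_draws)
```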

About this article

Cite this article

Hui, S.K. Understanding repeat playing behavior in casual games using a Bayesian data augmentation approach. Quant Mark Econ 15, 29–55 (2017). https://doi.org/10.1007/s11129-017-9180-2
