Abstract

Markov chain Monte Carlo (MCMC) is an essential set of tools for estimating features of probability distributions commonly encountered in modern applications. For MCMC simulation to produce reliable outcomes, it must generate observations representative of the target distribution, and it must be run long enough that the errors of the resulting Monte Carlo estimates are small. We review methods for assessing the reliability of the simulation effort, with an emphasis on those most useful in practically relevant settings. Both the strengths and weaknesses of these methods are discussed. The methods are illustrated in several examples and in a detailed case study.
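To make concrete what "small errors of Monte Carlo estimates" means in practice, the following is a minimal sketch (not taken from the article) of one standard output-analysis tool of the kind this review covers: the batch means estimator of the Monte Carlo standard error of a sample mean. The function name, the square-root-of-n default batch count, and the AR(1) chain used as a stand-in for MCMC output are all illustrative choices of ours, not the authors'.

```python
import numpy as np

def batch_means_se(chain, n_batches=None):
    """Batch means estimate of the Monte Carlo standard error of the
    sample mean of a univariate Markov chain (illustrative sketch)."""
    chain = np.asarray(chain, dtype=float)
    n = chain.size
    if n_batches is None:
        n_batches = int(np.floor(np.sqrt(n)))   # illustrative default: ~sqrt(n) batches
    b = n // n_batches                          # batch size
    trimmed = chain[: n_batches * b]            # drop the remainder so batches are equal
    batch_means = trimmed.reshape(n_batches, b).mean(axis=1)
    overall_mean = trimmed.mean()
    # Batch means estimate of the asymptotic variance in the Markov chain CLT
    sigma2_hat = b * np.sum((batch_means - overall_mean) ** 2) / (n_batches - 1)
    return np.sqrt(sigma2_hat / trimmed.size)

# Illustration on a synthetic AR(1) chain standing in for MCMC output
rng = np.random.default_rng(0)
x = np.zeros(50_000)
for t in range(1, x.size):
    x[t] = 0.9 * x[t - 1] + rng.normal()
print("estimate:", x.mean(), " MC standard error:", batch_means_se(x))
```

An estimate reported together with a standard error of this kind is what allows one to judge whether the simulation has been run long enough for the intended precision.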

