
Hamiltonian Markov chain Monte Carlo for partitioned sample spaces with application to Bayesian deep neural nets

  • Research Article
  • Published in: Journal of the Korean Statistical Society

Abstract

Allocating computation over multiple chains to reduce MCMC sampling time is crucial for making MCMC applicable to state-of-the-art models such as deep neural networks. One parallelization scheme for MCMC partitions the sample space and runs a separate chain in each component of the partition (VanDerwerken and Schmidler in Parallel Markov chain Monte Carlo. arXiv:1312.7479, 2013; Basse et al. in Artificial intelligence and statistics, pp 1318–1327, 2016). In this work, we adopt the bridge sampling approach of Basse et al. (2016) and apply constrained Hamiltonian Monte Carlo (HMC) on partitioned sample spaces. We propose a random dimension partition scheme that combines well with constrained HMC. We show empirically that this approach can expedite MCMC sampling for any unnormalized target distribution, such as a Bayesian neural network, in a high-dimensional setting. Furthermore, in the presence of multi-modality, the algorithm is expected to mix more efficiently when proper partition elements are chosen.
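
The partitioned-sampling idea in the abstract can be sketched concretely. The following is a toy illustration, not the authors' implementation: it restricts plain HMC to one partition element by rejecting any trajectory whose endpoint leaves the element (equivalent to giving points outside the element log-density negative infinity), and it forms the element by splitting a randomly chosen coordinate at zero, a minimal stand-in for the proposed random dimension partition scheme. All function names and tuning values (`eps`, `n_steps`) are illustrative assumptions.

```python
import numpy as np

def constrained_hmc_step(x, logp, grad_logp, in_region, eps=0.1, n_steps=10, rng=None):
    """One HMC step restricted to a single partition element.

    Proposals that land outside the element (in_region(q) is False) are
    always rejected, so the chain targets the unnormalized density
    renormalized on that element.
    """
    rng = np.random.default_rng() if rng is None else rng
    p0 = rng.standard_normal(x.shape)
    q, p = x.copy(), p0.copy()
    # Leapfrog integration of the Hamiltonian dynamics.
    p = p + 0.5 * eps * grad_logp(q)
    for i in range(n_steps):
        q = q + eps * p
        if i < n_steps - 1:
            p = p + eps * grad_logp(q)
    p = p + 0.5 * eps * grad_logp(q)
    if not in_region(q):              # trajectory left the element: reject
        return x
    # Standard Metropolis correction on the Hamiltonian.
    log_alpha = (logp(q) - 0.5 * p @ p) - (logp(x) - 0.5 * p0 @ p0)
    return q if np.log(rng.uniform()) < log_alpha else x

# Toy run: 2-D standard normal target, partitioned by the sign of one
# coordinate chosen uniformly at random.
rng = np.random.default_rng(0)
logp = lambda x: -0.5 * x @ x         # unnormalized N(0, I) log-density
grad_logp = lambda x: -x
d = int(rng.integers(2))              # randomly chosen splitting dimension
in_region = lambda x: x[d] >= 0.0     # one element of {x_d >= 0}, {x_d < 0}

x = rng.standard_normal(2)
x[d] = abs(x[d])                      # start inside the chosen element
samples = np.empty((2000, 2))
for t in range(2000):
    x = constrained_hmc_step(x, logp, grad_logp, in_region, rng=rng)
    samples[t] = x
```

Rejection at the boundary is the bluntest way to constrain HMC; reflecting the momentum at the boundary (as in Neal 2011) wastes fewer trajectories, and bridge sampling would then be used to reweight and recombine the per-element chains into draws from the full target.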


References

  • Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., & Isard, M. et al. (2016). Tensorflow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI). Savannah, Georgia, USA.

  • Alder, B. J., & Wainwright, T. E. (1959). Studies in molecular dynamics. I. General method. The Journal of Chemical Physics, 31, 459–466.

  • Basse, G., Smith, A., & Pillai, N. (2016). Parallel Markov chain Monte Carlo via spectral clustering. In Artificial intelligence and statistics (pp. 1318–1327), PMLR.

  • Blei, D. M., Kucukelbir, A., & McAuliffe, J. D. (2017). Variational inference: A review for statisticians. Journal of the American Statistical Association.

  • Bradford, R., & Thomas, A. (1996). Markov chain Monte Carlo methods for family trees using a parallel processor. Statistics and Computing, 6, 67–75.

  • Brockwell, A. E. (2006). Parallel Markov chain Monte Carlo simulation by pre-fetching. Journal of Computational and Graphical Statistics, 15, 246–261.

  • Byrd, J. M., Jarvis, S. A., & Bhalerao, A. H. (2008). Reducing the run-time of MCMC programs by multithreading on SMP architectures. In IEEE International Symposium on Parallel and Distributed Processing (IPDPS 2008) (pp. 1–8).

  • Choo, K. (2000). Learning hyperparameters for neural network models using Hamiltonian dynamics. Ph.D. thesis.

  • Foreman-Mackey, D., Hogg, D. W., Lang, D., & Goodman, J. (2013). emcee: The MCMC hammer. Publications of the Astronomical Society of the Pacific, 125, 306.

  • Gelman, A., & Meng, X.-L. (1998). Simulating normalizing constants: From importance sampling to bridge sampling to path sampling. Statistical Science, 13(2), 163–185.

  • Glynn, P. W., & Heidelberger, P. (1992). Analysis of initial transient deletion for parallel steady-state simulations. SIAM Journal on Scientific and Statistical Computing, 13, 904–922.

  • Goodman, J., & Weare, J. (2010). Ensemble samplers with affine invariance. Communications in Applied Mathematics and Computational Science, 5, 65–80.

  • Gronau, Q. F., Sarafoglou, A., Matzke, D., Ly, A., Boehm, U., Marsman, M., Leslie, D. S., Forster, J. J., Wagenmakers, E.-J., & Steingroever, H. (2017). A tutorial on bridge sampling. arXiv preprint arXiv:1703.05984.

  • Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

  • Liu, J. S. (2008). Monte Carlo strategies in scientific computing. Berlin: Springer.

  • Neal, R. M. (2012). Bayesian learning for neural networks (Vol. 118). Berlin: Springer.

  • Neal, R. M. (2011). MCMC using Hamiltonian dynamics. In Handbook of Markov Chain Monte Carlo, (pp. 113–162). Chapman and Hall/CRC.

  • Nishihara, R., Murray, I., & Adams, R. P. (2014). Parallel MCMC with generalized elliptical slice sampling. Journal of Machine Learning Research, 15, 2087–2112.

  • Robbins, H., & Monro, S. (1951). A stochastic approximation method. The Annals of Mathematical Statistics, 22(3), 400–407.

  • Robert, C. P. (2004). Monte Carlo methods. Hoboken: Wiley.

  • Rosenthal, J. S. (2000). Parallel computing and Monte Carlo algorithms. Far East Journal of Theoretical Statistics, 4, 207–236.

  • Swendsen, R. H., & Wang, J.-S. (1986). Replica Monte Carlo simulation of spin-glasses. Physical Review Letters, 57, 2607.

  • VanDerwerken, D. N., & Schmidler, S. C. (2013). Parallel Markov chain Monte Carlo. arXiv preprint arXiv:1312.7479.

  • Wilkinson, D. J. (2006). Parallel Bayesian computation. Statistics Textbooks and Monographs, 184, 477.

Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) Grant funded by the Korea government (MSIT) (No. 2018R1A2A3074973).

Author information

Correspondence to Jaeyong Lee.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kim, M., Lee, J. Hamiltonian Markov chain Monte Carlo for partitioned sample spaces with application to Bayesian deep neural nets. J. Korean Stat. Soc. 49, 139–160 (2020). https://doi.org/10.1007/s42952-019-00001-3
