Abstract
Mathematical models implemented on a computer have become a driving force in accelerating the cycle of scientific discovery, because computer models are typically much faster and more economical to run than physical experiments. In this work, we develop an empirical Bayes approach to predicting physical quantities using a computer model, where the computer model under consideration needs to be calibrated and is computationally expensive. We propose a Gaussian process emulator and a Gaussian process model for the systematic discrepancy between the computer model and the underlying physical process. This allows for closed-form and easy-to-compute predictions given by a conditional distribution induced by the Gaussian processes. We provide a rigorous theoretical justification of the proposed approach by establishing posterior consistency of the estimated physical process. The computational efficiency of the methods is demonstrated in an extensive simulation study and a real data example. The newly established approach makes enhanced use of computer models from both practical and theoretical standpoints.
References
Amewou-Atisso, M., Ghosal, S., Ghosh, J.K., Ramamoorthi, R.: Posterior consistency for semi-parametric regression problems. Bernoulli 9(2), 291–312 (2003). https://doi.org/10.3150/bj/1068128979
Audi, G., Wapstra, A., Thibault, C.: The AME2003 atomic mass evaluation: (II). Tables, graphs and references. Nuclear Physics A 729, 337–676 (2003), URL http://www.sciencedirect.com/science/article/pii/S0375947403018098
Bayarri, M.J., Berger, J.O., Paulo, R., Sacks, J., Cafeo, J.A., Cavendish, J., Lin, C.H., Tu, J.: A framework for validation of computer models. Technometrics 49, 138–154 (2007). https://doi.org/10.1198/004017007000000092
Belitser, E., Enikeeva, F.: Empirical Bayesian test of the smoothness. Math. Methods of Stat. 17(1), 1–18 (2008). https://doi.org/10.3103/S1066530708010018
Benzaid, D., Bentridi, S., Kerraci, A., Amrani, N.: Bethe-Weizsäcker semiempirical mass formula coefficients 2019 update based on AME2016. Nucl. Sci. Tech. 31, 9 (2020). https://doi.org/10.1007/s41365-019-0718-8
Bertsch, G.F., Bingham, D.: Estimating parameter uncertainty in binding-energy models by the frequency-domain bootstrap. Physical Review Letters 119, 252501 (2017), URL https://link.aps.org/doi/10.1103/PhysRevLett.119.252501
Bethe, H.A., Bacher, R.F.: Nuclear physics A. Stationary states of nuclei. Rev. Mod. Phys. 8, 82–229 (1936). https://doi.org/10.1103/RevModPhys.8.82, URL https://link.aps.org/doi/10.1103/RevModPhys.8.82
Brynjarsdóttir, J., O’Hagan, A.: Learning about physical parameters: the importance of model discrepancy. Inverse Problems 30, 114007 (2014). https://doi.org/10.1088%2F0266-5611%2F30%2F11%2F114007
Choi, T.: Posterior consistency in nonparametric regression problems under Gaussian process priors (2005)
Choi, T.: Alternative posterior consistency results in nonparametric binary regression using Gaussian process priors. Journal of Statistical Planning and Inference 137(9), 2975–2983 (2007). https://doi.org/10.1016/j.jspi.2006.11.001, URL http://www.sciencedirect.com/science/article/pii/S0378375807000407
Choi, T., Schervish, M.J.: On posterior consistency in nonparametric regression problems. Journal of Multivariate Analysis 98(10), 1969–1987 (2007). https://doi.org/10.1016/j.jmva.2007.01.004, URL http://www.sciencedirect.com/science/article/pii/S0047259X07000048
Fayans, S.A.: Towards a universal nuclear density functional. J Exp. Theor. Phy. Lett. 68(3), 169–174 (1998). https://doi.org/10.1134/1.567841
Florens, J.P., Simoni, A.: Regularized posteriors in linear ill-posed inverse problems. Scandinavian Journal of Statistics 39(2), 214–235 (2012). https://doi.org/10.1111/j.1467-9469.2011.00784.x, URL https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1467-9469.2011.00784.x
Gelman, A., Carlin, J., Stern, H., Dunson, D., Vehtari, A., Rubin, D.: Bayesian Data Analysis, 3rd edn. CRC Press (2013), URL https://books.google.com/books?id=ZXL6AQAAQBAJ
Ghosal, S., Roy, A.: Posterior consistency of Gaussian process prior for nonparametric binary regression. Ann. Statist. 34(5), 2413–2429 (2006). https://doi.org/10.1214/009053606000000795
Gu, M., Wang, L.: Scaled Gaussian stochastic process for computer model calibration and prediction. SIAM/ASA J Uncertain. Quantif. 6(4), 1555–1583 (2018). https://doi.org/10.1137/17M1159890
Higdon, D., Gattiker, J., Williams, B., Rightley, M.: Computer model calibration using high-dimensional output. Journal of the American Statistical Association 103, 570–583 (2008), URL http://www.jstor.org/stable/27640080
Higdon, D., McDonnell, J.D., Schunck, N., Sarich, J., Wild, S.M.: A Bayesian approach for parameter estimation and prediction using a computationally intensive model. J Phy. G: Nuclear Part. Phys. 42(3), 034009 (2015). https://doi.org/10.1088/0954-3899/42/3/034009
Kejzlar, V., Maiti, T.: Variational inference with vine copulas: an efficient approach for Bayesian computer model calibration. arXiv:2003.12890 (2020)
Kejzlar, V., Neufcourt, L., Nazarewicz, W., Reinhard, P.G.: Statistical aspects of nuclear mass models. Journal of Physics G: Nuclear and Particle Physics 47(9), 094001 (2020). https://doi.org/10.1088/1361-6471/ab907c, URL https://doi.org/10.1088%2F1361-6471%2Fab907c
Kennedy, M.C., O’Hagan, A.: Bayesian calibration of computer models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 63, 425–464 (2001). https://doi.org/10.1111/1467-9868.00294, URL https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9868.00294
King, G.B., Lovell, A.E., Neufcourt, L., Nunes, F.M.: Direct comparison between Bayesian and frequentist uncertainty quantification for nuclear reactions. Phy. Rev. Lett. 122, 232502 (2019)
Kirson, M.W.: Mutual influence of terms in a semi-empirical mass formula. Nucl Phys A 798(1), 29–60 (2008). https://doi.org/10.1016/j.nuclphysa.2007.10.011, URL http://www.sciencedirect.com/science/article/pii/S0375947407007531
Knapik, B., Szabo, B., van der Vaart, A., van Zanten, J.: Bayes procedures for adaptive inference in inverse problems for the white noise model. Probability Theory and Related Fields (2016). https://doi.org/10.1007/s00440-015-0619-7
Kortelainen, M., Lesinski, T., Moré, J.J., Nazarewicz, W., Sarich, J., Schunck, N., Stoitsov, M.V., Wild, S.M.: Nuclear energy density optimization. Phy. Rev.C 82(2), 024313 (2010). https://doi.org/10.1103/PhysRevC.82.024313
Kortelainen, M., McDonnell, J., Nazarewicz, W., Reinhard, P.G., Sarich, J., Schunck, N., Stoitsov, M.V., Wild, S.M.: Nuclear energy density optimization: large deformations. Phy. Rev. C 85, 024304 (2012). https://doi.org/10.1103/PhysRevC.85.024304
Kortelainen, M., McDonnell, J., Nazarewicz, W., Olsen, E., Reinhard, P.G., Sarich, J., Schunck, N., Wild, S.M., Davesne, D., Erler, J., Pastore, A.: Nuclear energy density optimization: Shell structure. Phys Rev C 89, 054314 (2014). https://doi.org/10.1103/PhysRevC.89.054314
Krane, K.: Introductory Nuclear Physics. Wiley (1987), URL https://books.google.com/books?id=ConwAAAAMAAJ
Martino, L., Laparra, V., Camps-Valls, G.: Probabilistic cross-validation estimators for Gaussian process regression. In: 2017 25th European Signal Processing Conference (EUSIPCO), pp. 823–827 (2017)
McDonnell, J.D., Schunck, N., Higdon, D., Sarich, J., Wild, S.M., Nazarewicz, W.: Uncertainty quantification for nuclear density functional theory and information content of new measurements. Phy. Rev. Lett. 114(12), 122501 (2015). https://doi.org/10.1103/PhysRevLett.114.122501
Morris, M.D., Mitchell, T.J.: Exploratory designs for computational experiments. Journal of Statistical Planning and Inference 43(3), 381–402 (1995). https://doi.org/10.1016/0378-3758(94)00035-T, URL http://www.sciencedirect.com/science/article/pii/037837589400035T
Myers, W.D., Swiatecki, W.J.: Nuclear masses and deformations. Nucl Phys 81(2), 1–60 (1966). https://doi.org/10.1016/S0029-5582(66)80001-9, URL http://www.sciencedirect.com/science/article/pii/S0029558266800019
Neufcourt, L., Cao, Y., Nazarewicz, W., Viens, F.: Bayesian approach to model-based extrapolation of nuclear observables. Physical Review C 98, 034318 (2018), URL https://link.aps.org/doi/10.1103/PhysRevC.98.034318
Neufcourt, L., Cao, Y., Nazarewicz, W., Olsen, E., Viens, F.: Neutron drip line in the Ca region from Bayesian model averaging. Physical Review Letters 122, 062502 (2019), URL https://link.aps.org/doi/10.1103/PhysRevLett.122.062502
Plumlee, M.: Bayesian calibration of inexact computer models. J Am. Stat. Assoc. 112, 1274–1285 (2017). https://doi.org/10.1080/01621459.2016.1211016
Plumlee, M.: Computer model calibration with confidence and consistency. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 81(3), 519–545 (2019). https://doi.org/10.1111/rssb.12314, URL https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/rssb.12314
Plumlee, M., Joseph, V.R., Yang, H.: Calibrating functional parameters in the ion channel models of cardiac cells. J Am. Stat. Assoc. 111, 500–509 (2016)
Pollard, D., Chang, W., Haran, M., Applegate, P., DeConto, R.: Large ensemble modeling of the last deglacial retreat of the West Antarctic Ice Sheet: comparison of simple and advanced statistical techniques. Geosci. Model Dev. 9(5), 1697–1723 (2016)
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press, Cambridge, MA (2006), URL www.GaussianProcess.org/gpml
Reinhard, P.G., Bender, M., Nazarewicz, W., Vertse, T.: From finite nuclei to the nuclear liquid drop: Leptodermous expansion based on self-consistent mean-field theory. Phys Rev C 73, 014309 (2006). https://doi.org/10.1103/PhysRevC.73.014309, URL https://link.aps.org/doi/10.1103/PhysRevC.73.014309
Rousseau, J., Szabo, B.: Asymptotic behaviour of the empirical Bayes posteriors associated to the maximum marginal likelihood estimator. Ann. Statist. 45(2), 833–865 (2017). https://doi.org/10.1214/16-AOS1469
Schaeffer, D.G., Cain, J.W.: Nonlinear Systems: Local Theory, Springer New York, New York, NY, pp. 79–109 (2016). https://doi.org/10.1007/978-1-4939-6389-8_3
Schunck, N., O’Neal, J., Grosskopf, M., Lawrence, E., Wild, S.M.: Calibration of energy density functionals with deformed nuclei. Journal of Physics G: Nuclear and Particle Physics (2020). URL http://iopscience.iop.org/10.1088/1361-6471/ab8745
Serra, P., Krivobokova, T.: Adaptive empirical Bayesian smoothing splines. Bayesian Anal. 12(1), 219–238 (2017). https://doi.org/10.1214/16-BA997
Sexton, D.M.H., Murphy, J.M., Collins, M., Webb, M.J.: Multivariate probabilistic projections using imperfect climate models. Part I: outline of methodology. Climate Dyn. 38(11), 2513–2542 (2012)
Shiryaev, A.N.: Convergence of Probability Measures. Central Limit Theorem, Springer New York, New York, NY, pp. 308–378 (1996). https://doi.org/10.1007/978-1-4757-2539-1_4
Sniekers, S., van der Vaart, A.: Adaptive Bayesian credible sets in regression with a Gaussian process prior. Electron J Statist. 9(2), 2475–2527 (2015). https://doi.org/10.1214/15-EJS1078
Sundararajan, S., Keerthi, S.S.: Predictive approaches for choosing hyperparameters in Gaussian processes. Neural Comput. 13(5), 1103–1118 (2001). https://doi.org/10.1162/08997660151134343
Szabó, B.T., van der Vaart, A.W., van Zanten, J.H.: Empirical Bayes scaling of Gaussian priors in the white noise model. Electron J Statist. 7, 991–1018 (2013). https://doi.org/10.1214/13-EJS798
Teckentrup, A.L.: Convergence of Gaussian process regression with estimated hyper-parameters and applications in Bayesian inverse problems. SIAM/ASA J Uncertain. Quantif. 8(4), 1310–1337 (2020). https://doi.org/10.1137/19M1284816
Tokdar, S.T., Ghosh, J.K.: Posterior consistency of logistic Gaussian process priors in density estimation. Journal of Statistical Planning and Inference 137(1), 34–42 (2007). https://doi.org/10.1016/j.jspi.2005.09.005, URL http://www.sciencedirect.com/science/article/pii/S0378375805002648
Tuo, R., Wu, C.F.J.: Efficient calibration for imperfect computer models. Ann. Stat. 43, 2331–2352 (2015). https://doi.org/10.1214/15-AOS1314
Tuo, R., Wu, C.F.J.: A theoretical framework for calibration in computer models: parametrization, estimation and convergence properties. SIAM/ASA J Uncertain. Quantif. 4, 767–795 (2016). https://doi.org/10.1137/151005841
van der Vaart, A., Wellner, J.: Weak Convergence and Empirical Processes: With Applications to Statistics. Springer Series in Statistics, Springer (1996), URL https://books.google.com/books?id=OCenCW9qmp4C
van der Vaart, A.W., van Zanten, J.H.: Rates of contraction of posterior distributions based on Gaussian process priors. Ann. Statist. 36(3), 1435–1463 (2008). https://doi.org/10.1214/009053607000000613
Wahba, G.: Spline Models for Observational Data. Society for Industrial and Applied Mathematics (1990). https://doi.org/10.1137/1.9781611970128, URL https://epubs.siam.org/doi/abs/10.1137/1.9781611970128
Weizsäcker, C.F.v.: Zur Theorie der Kernmassen. Z Phys. 96(7), 431–458 (1935). https://doi.org/10.1007/BF01337700
Williams, B., Higdon, D., Gattiker, J., Moore, L., McKay, M., Keller-McNulty, S.: Combining experimental data and computer simulations, with an application to flyer plate experiments. Bayesian Anal. 1(4), 765–792 (2006)
Xie, F., Xu, Y.: Bayesian projected calibration of computer models. J Am. Stat. Assoc. 0(0), 1–18 (2020). https://doi.org/10.1080/01621459.2020.1753519
Yuan, C.: Uncertainty decomposition method and its application to the liquid drop model. Phys Rev C 93, 034310 (2016). https://doi.org/10.1103/PhysRevC.93.034310, URL https://link.aps.org/doi/10.1103/PhysRevC.93.034310
Zhang, L., Jiang, Z., Choi, J., Lim, C.Y., Maiti, T., Baek, S.: Patient-specific prediction of abdominal aortic aneurysm expansion using Bayesian calibration. IEEE Journal of Biomedical and Health Informatics (2019). https://doi.org/10.1109/JBHI.2019.2896034
Acknowledgements
The authors thank the reviewers and the Editor for their helpful comments and ideas. This research was partially supported by National Science Foundation grant DMS-1952856.
Appendices
Equivalency of hierarchical model
To establish the equivalency between the Bayesian model given by the data likelihood \(p(\varvec{d}|\varvec{\theta }, \varvec{\phi }, \sigma )\) and the hierarchical model (see Sect. 4), we need to show that the following equality holds
where \(\varvec{\zeta }= (\zeta (t_1), \dots ,\zeta (t_n)) = (\zeta _1, \dots ,\zeta _n)\) and the density \(p(\varvec{\zeta }, \varvec{z}| \varvec{\theta }, \varvec{\phi })\) is the multivariate normal distribution with mean \(M(\varvec{\theta },\varvec{\phi })\) (see (6)) and covariance
For ease of notation, let us now write \(M(\varvec{\theta },\varvec{\phi }) = (M_{y}^T, M_{z}^T)^T\). Then
The integral is equal to 1 since it is the integral of a multivariate normal probability density function over \(\zeta \) with covariance \(( (\sigma ^2 I_n)^{-1} + (C_{11} - C_{12}C^{-1}_{22} C_{21})^{-1})^{-1}\). Namely
where we used the Schur complement identity for determinants in the first equality and
Lastly, considering the notation
we have
where \(C^-_{11} = C_{11} - C_{12}C^{-1}_{22} C_{21}\) and \(\varvec{b}\) is a constant column vector. This shows that the integral is indeed equal to 1 as stated, and the equality (30) holds.
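The determinant identity invoked above can be sanity-checked numerically. The following sketch builds an arbitrary random positive-definite block matrix (the block sizes are illustrative) and verifies the Schur complement identity \(\det C = \det (C_{22})\,\det (C_{11} - C_{12}C^{-1}_{22} C_{21})\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Random symmetric positive-definite block matrix C = [[C11, C12], [C21, C22]].
n, m = 4, 3
A = rng.standard_normal((n + m, n + m))
C = A @ A.T + (n + m) * np.eye(n + m)   # adding a multiple of I keeps C well-conditioned
C11, C12 = C[:n, :n], C[:n, n:]
C21, C22 = C[n:, :n], C[n:, n:]

# Schur complement identity for determinants:
#   det(C) = det(C22) * det(C11 - C12 C22^{-1} C21)
schur = C11 - C12 @ np.linalg.solve(C22, C21)
lhs = np.linalg.det(C)
rhs = np.linalg.det(C22) * np.linalg.det(schur)
assert np.isclose(lhs, rhs, rtol=1e-8)
```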
Proof of Theorem 1
Note that for any \(\epsilon > 0\), the posterior probability of interest \(p(\zeta \in U_n^C|y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n)\) can be bounded from above as
where
Since the assumption (A3) implies that \(1_{ \{\left| \frac{{\hat{\sigma }}_{n}}{\sigma _0}-1\right| > \epsilon \}}\xrightarrow [\text {n}]{\quad } 0 \text { a.s. } P_0\), it is enough to show that there exists \(\epsilon > 0\) so that
for some \(\beta _1, \beta _2, \beta _3 > 0\) where \(\beta _3 \le \min \{\beta _1, \beta _2\}\).
The rest of the proof follows the general steps of the proof of Theorem 1 in Choi and Schervish (2007) and Theorem 9 in Choi (2007), with some non-trivial treatment of the constant \(\epsilon \). As in Choi (2007), we assume that the covariance function is smooth enough that the supremum of the variance of the Gaussian process is continuous with respect to \((\varvec{\theta }, \varvec{\phi })\) on the compact set \(\varUpsilon \). We provide step-by-step details below.
Step 1) By Markov's inequality, for any \(\rho > 0\)
which due to the condition (i) of (A2) and the first Borel-Cantelli Lemma yields
Since this does not depend on \((\varvec{\theta }, \varvec{\phi })\), it implies (31).
Step 2) By Fubini's theorem, for any \(0<\epsilon < {\tilde{\epsilon }}_2\)
where \(\tilde{c}_\epsilon = c_2 + \log (1-\epsilon ) - \log (1+\epsilon )\); condition (iii) of (A2) implies \(\tilde{c}_\epsilon > 0\). Thus
Therefore, for any \(\epsilon > 0\) so that \(\epsilon < {\tilde{\epsilon }}_2\) there exists a constant \(\tilde{c}_\epsilon \) for which the first Borel-Cantelli Lemma implies
Since this does not depend on \((\varvec{\theta }, \varvec{\phi })\), it implies (32).
Step 3) Proceeding as in step 2), Fubini's theorem implies
Condition (ii) of (A2) and the first Borel-Cantelli Lemma imply that for any \(\epsilon < \frac{1 - e^{-c_1}}{1 + e^{-c_1}}\):
where \(\tilde{k}_\epsilon = c_1 + \log (1 -\epsilon ) - \log (1+\epsilon )\).
Step 4) To prove (34), given any \(0<\rho < 1\), we first observe the following:
Let us now define \(\log _+(x)= \max \{0, \log (x)\}\) and \(\log _-(x)= -\min \{0, \log (x)\}\) as well as
Then we get
Hence, by condition (i) of (A1), for any \(\rho < {\tilde{\epsilon }}_1\) and \(\zeta \in B\)
and by Kolmogorov's strong law of large numbers for independent non-identically distributed random variables (e.g. Shiryaev (1996), Chapter 3),
As a result, for every \(\zeta \in B\), with \(P_0\) probability 1
The fourth line follows from the almost sure convergence proved in the previous paragraph, and the second-to-last line follows from Amewou-Atisso et al. (2003). We now make use of condition (ii) of (A1). Let us consider \(\beta > 0\), select \(\varDelta \) so that \(\varDelta + \sqrt{\frac{\varDelta }{2}} \le \frac{\beta }{8}\), and set \(C = B \cap \{\zeta : K_i(\zeta _0,\zeta ) < \varDelta \ \text { for all }i\}\). By (A1) there exists \({\tilde{\epsilon }}_1\) so that \(\varPi (C|\varvec{\theta }, \varvec{\phi }) > 0\) for all \(0<\rho <{\tilde{\epsilon }}_1\). Therefore, for each \(\zeta \in C\)
since \(\frac{1}{n}\sum _{i=1}^{n} K_{i}(\zeta _0,\zeta ) <\varDelta \) for all \(\zeta \in C\). Finally, for any \(\rho < \min \{{\tilde{\epsilon }}_1, \frac{1-e^{\frac{-\beta }{8}}}{1+e^{\frac{-\beta }{8}}}\}\)
Note that the actual bound on \(\mathbf {I}_{3n}\) does not depend on \((\varvec{\theta }, \varvec{\phi })\). Taking \(\epsilon < \min \{{\tilde{\epsilon }}_2, \frac{1 - e^{-c_1}}{1 + e^{-c_1}}\}\) concludes the proof.
Proof of Lemma 1
Theorem 5 of Ghosal and Roy (2006) implies that there exist positive constants \(C, d_1, \dots , d_p\) so that for \(i = 1, \dots , p\)
The continuity of \(\rho ^2_i(\varvec{\theta }, \varvec{\phi })\), for \(i = 0,\cdots , p\), on a compact set \(\varUpsilon \) implies that they are uniformly bounded. Therefore, there exist universal constants \({(c_{0,1},c_{0,2})}, \cdots \), \({(c_{p,1}, c_{p,2})}\) such that for \(i = 0, \cdots , p\),
Hence, for \(i = 0, \cdots , p\),
Proof of Lemma 2
We shall first define some notation. Let \(0<r <\frac{\nu }{2}\) and \(t = \frac{r}{4}\). Let \(N_t = N(t, \mathcal {F}_n, \parallel \cdot \parallel _{\infty })\) be the covering number of \(\mathcal {F}_n\). In Theorem 2.7.1, van der Vaart and Wellner (1996) show that there exists a constant K so that \(\log N_t \le \frac{KM_n}{t^p}\), and therefore \(N_t = \mathcal {O}(M_n)\), where \(M_n = \mathcal {O}(n^\alpha )\) for \(\alpha \in (\frac{1}{2},1)\) according to the definition of the sieves. Let us consider \(\tau \in (\frac{\alpha }{2}, \frac{1}{2})\) and define \(c_n = n^{\tau }\) so that \(\log (N_t) = o(c_n^2)\). Moreover, let \(\zeta ^1, \dots , \zeta ^{N_t} \in \mathcal {F}_n\) be finitely many elements of the sieve such that for every \(\zeta \in \mathcal {F}_n\) there is \(i \in \{1, \dots , N_t\}\) satisfying \(\parallel \zeta - \zeta ^i \parallel _\infty < t\). This implies that if \(\zeta \in \mathcal {F}_n\) is such that \(\int |\zeta (\varvec{t}) - \zeta _0(\varvec{t})| \text {d}Q_n(\varvec{t}) > \nu \), then \(\int |\zeta ^i(\varvec{t}) - \zeta _0(\varvec{t})| \text {d}Q_n(\varvec{t}) > \frac{\nu }{2}\).
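The rate comparison behind \(\log (N_t) = o(c_n^2)\) is elementary: \(\log N_t \lesssim M_n = \mathcal {O}(n^\alpha )\) while \(c_n^2 = n^{2\tau }\) with \(2\tau > \alpha \). A quick numerical sketch illustrates the decay of the ratio; the constants K, t, p and the exponents chosen below are hypothetical placeholders, not the paper's values.

```python
import numpy as np

# With M_n = n**alpha (alpha in (1/2, 1)) and c_n = n**tau (tau in (alpha/2, 1/2)),
# the entropy bound log N_t <= K * M_n / t**p gives log(N_t) / c_n**2 -> 0,
# since the ratio scales like n**(alpha - 2*tau) and alpha - 2*tau < 0.
alpha, tau = 0.8, 0.45          # tau > alpha / 2 = 0.4
K, t, p = 1.0, 0.1, 2           # hypothetical constants from the covering bound
ns = np.array([1e3, 1e5, 1e7, 1e9])
ratio = (K * ns**alpha / t**p) / ns**(2 * tau)
assert np.all(np.diff(ratio) < 0)      # the bound on log(N_t)/c_n^2 decreases in n
assert ratio[-1] < 0.3 * ratio[0]      # here with n^(-0.1) decay toward zero
```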
The next step in the proof is to construct a test for each \(\zeta ^i\), with the resulting functions \(\varPhi _n\) defined as a combination of the individual tests, and to show that the probabilities of type I and type II errors satisfy the properties of the lemma. Let us recall that \(\zeta _j = \zeta (\varvec{t}_j)\) and \(\zeta _{0,j} = \zeta _0 (\varvec{t}_j)\). For an arbitrary \(\zeta \in \mathcal {F}_n\) such that \(\parallel \zeta - \zeta ^i \parallel _\infty < t\), let us define \(\zeta _{1,j} = \zeta ^i(\varvec{t}_j)\), and \(b_j = 1\) if \(\zeta _{1,j} > \zeta _{0,j}\) and \(b_j = -1\) otherwise. For any \(\nu > 0\), let \(\varPsi _n[\zeta , \nu ]\) be the indicator of the set A defined as follows
The test functions \(\varPhi _n\) are then
Type I error) Mill's ratio implies
The function \(\varPhi (\cdot )\) is the CDF of the standard normal distribution. Consequently, we have
and
Type II error) It is sufficient to find i for which the probability of type II error of \(\varPsi _n[\zeta ^i, \frac{\nu }{2}]\), given an arbitrary \(\zeta \) in \(W^C_{\nu , n} \cap \mathcal {F}_n\), is sufficiently small. This is because the probability of type II error for the composite test \(\varPhi _n\) is no larger than the smallest among those of the \(\varPsi _n[\zeta ^i, \frac{\nu }{2}]\). Note that here we assume \(\int |\zeta (\varvec{t}) - \zeta _0(\varvec{t})| \text {d}Q_n(\varvec{t}) > \nu \), and then \(\int |\zeta ^i(\varvec{t}) - \zeta _0(\varvec{t})| \text {d}Q_n(\varvec{t}) > \frac{\nu }{2}\). For every \(r < \frac{\nu }{2}\), Choi (2005) shows that
Let n be large enough so that \(4\sigma _0 c_n < r\sqrt{n}\); then for any \(0<\epsilon <1\)
To establish the part (ii) of the lemma, we need to show that there exists \(0<{\tilde{\epsilon }}<1\) so that for any \(\epsilon <{\tilde{\epsilon }}\)
Take \(\kappa = \frac{r^2}{32 \sigma _0^2}\) and define \(b(\epsilon )\) to be the left-hand side of (35),
The function \(b(\epsilon )\) is clearly continuous at \(\epsilon = 0\). Hence, for each \(\kappa > 0\), there exists \({\tilde{\epsilon }}\) such that for all \(0<\epsilon <{\tilde{\epsilon }}\), \(b(\epsilon ) > 0\).
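The Mill's ratio bound used in the type I error step, \(1 - \varPhi (x) \le \phi (x)/x\) for \(x > 0\) (with \(\phi \) the standard normal density), is easy to verify numerically:

```python
import math

def phi(x):
    """Standard normal probability density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# Mill's ratio bound: 1 - Phi(x) <= phi(x) / x for all x > 0.
for x in [0.5, 1.0, 2.0, 4.0]:
    assert 1 - Phi(x) <= phi(x) / x
```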
Proof of Theorem 2
First, we show that \({\hat{\sigma }}_n^2\) is asymptotically unbiased. Note that
because \(\epsilon _i {\mathop {\sim }\limits ^{i.i.d.}} \mathcal {N}(0,1)\). Consequently
Since \(\zeta _0\) is continuously differentiable on the compact and convex set \(\varvec{\varOmega }\), it is also (globally) Lipschitz on \(\varvec{\varOmega }\) (e.g. Schaeffer and Cain (2016), Corollary 3.2.4), and there exists a real constant K so that
Therefore, due to the design assumption (AD)
and the combination of (36) with (37) implies
To show the almost sure convergence of \({\hat{\sigma }}_n^2\), let us now denote \(x_i = (y_{i + 1} - y_i)^2\) and rewrite the estimator \({\hat{\sigma }}_n^2\) as a sum of two estimators, each consisting of a sum of independent variables:
Without loss of generality, we assumed that n is an odd integer. Lastly, note that \(\mathbb {V}ar(x_i) \le C < \infty \) uniformly in i. This is because the differences \(\zeta _0(\varvec{t}_{i+1}) - \zeta _0(\varvec{t}_{i})\) are uniformly bounded on the compact set \(\varvec{\varOmega }\) due to the continuity of \(\zeta _0\), and the \(y_{i+1} - y_i\) are normal with bounded moments. We can now apply Kolmogorov's strong law of large numbers for independent non-identically distributed random variables (e.g. Shiryaev (1996), Chapter 3),
and as a result
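The consistency of the first-difference variance estimator can be illustrated with a small Monte Carlo sketch. The normalization \({\hat{\sigma }}_n^2 = \sum _i (y_{i+1}-y_i)^2 / (2(n-1))\), the smooth truth, and the numerical values below are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# First-difference estimator of the noise variance (normalization assumed here):
#   sigma2_hat = sum_i (y_{i+1} - y_i)^2 / (2 (n - 1)).
# Because zeta_0 is Lipschitz and the sorted design fills in, the signal
# increments zeta_0(t_{i+1}) - zeta_0(t_i) contribute a vanishing amount,
# and sigma2_hat converges to sigma_0^2.
sigma0 = 0.5
n = 200_000
t = np.sort(rng.uniform(0, 1, n))        # space-filling design on [0, 1]
zeta0 = np.sin(2 * np.pi * t)            # hypothetical smooth true process
y = zeta0 + sigma0 * rng.standard_normal(n)
sigma2_hat = np.sum(np.diff(y) ** 2) / (2 * (n - 1))
assert abs(sigma2_hat - sigma0**2) < 0.01
```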
The LDM calibration
The analysis of the LDM follows our previous study in Kejzlar and Maiti (2020). Here we provide a concise discussion of the choices of prior distributions and the GP specifications.
GP specifications. For the computer model \(E_{\text {B}}(Z,N)\), we consider a GP prior with mean zero and covariance function
We also assume a GP prior for the systematic discrepancy \(\delta (Z,N)\) with mean zero and covariance function
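The explicit kernel expressions are not reproduced in this version of the text. As an illustration only, a squared-exponential kernel of the general form used for such GP priors (the function name and hyperparameter values below are hypothetical) can be implemented as:

```python
import numpy as np

def sq_exp_kernel(X1, X2, eta=1.0, lengthscale=1.0):
    """k(x, x') = eta^2 * exp(-||x - x'||^2 / (2 * lengthscale^2))."""
    X1, X2 = np.atleast_2d(X1), np.atleast_2d(X2)
    d2 = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return eta**2 * np.exp(-d2 / (2 * lengthscale**2))

# Example: a covariance matrix over a few (Z, N) input pairs.
ZN = np.array([[8.0, 8.0], [20.0, 20.0], [28.0, 28.0]])
K = sq_exp_kernel(ZN, ZN, eta=2.0, lengthscale=10.0)
assert np.allclose(K, K.T)                  # symmetric
assert np.all(np.linalg.eigvalsh(K) > 0)    # positive definite
```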
Prior distributions. The prior distributions for the calibration parameters \(\varvec{\theta }\) are chosen to be wide enough to cover the space of all their reasonable values:
The prior distributions for the hyperparameters \(\varvec{\phi }\) were selected as \(Gamma(\alpha , \beta )\) with the shape parameter \(\alpha \) and scale parameter \(\beta \). They are chosen to be weakly informative so that they correspond to the scale of these parameters given by the literature on nuclear mass models (Weizsäcker 1935; Bethe and Bacher 1936; Myers and Swiatecki 1966; Fayans 1998; Kirson 2008; McDonnell et al. 2015; Kortelainen et al. 2010, 2012, 2014; Benzaid et al. 2020; Kejzlar et al. 2020). In particular,
Since the majority of the masses in the training dataset are larger than 1000 MeV, we consider the following prior for \(\eta _f\) to reflect this:
Numerical study of the conditional covariance \(k_\zeta \)
Here we present a numerical investigation of our conjecture about the asymptotic behavior of the conditional covariance \(k_{\zeta }\). We show that, as the number of model evaluations s increases (assuming some space-filling design), the covariance function \(k_{\delta }\) quickly dominates, which strongly suggests that \(k_{\zeta }\) and \(k_{\delta }\) have similar asymptotic behavior with respect to s. Our rationale is that by informing the prior distribution for \(\zeta \) with more model evaluations, we effectively reduce the uncertainty about the computer model.
Study design
We consider a simple scenario with the joint space of model and calibration inputs over \([0,1]^2\). The input pairs \((\tilde{t}_j, {\tilde{\theta }}_j)\) were generated using a space-filling Latin hypercube design. The true value of the calibration parameter was varied over \(\theta \in \{0.3, 0.5, 0.8\}\).
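A Latin hypercube design of this kind can be generated in a few lines of NumPy; the sample size s below is a hypothetical placeholder, while the design property checked at the end (one point per \(1/s\) stratum in each coordinate) is what defines the Latin hypercube.

```python
import numpy as np

rng = np.random.default_rng(0)

def latin_hypercube(n, d, rng):
    """n points in [0, 1]^d with one point per 1/n stratum in every coordinate."""
    # Stratified samples: row i lies in [i/n, (i+1)/n) in every column...
    u = (rng.random((n, d)) + np.arange(n)[:, None]) / n
    # ...then randomly permute each column to decouple the coordinates.
    for j in range(d):
        u[:, j] = u[rng.permutation(n), j]
    return u

s = 25                                   # hypothetical number of design points
design = latin_hypercube(s, 2, rng)      # rows are (t_j, theta_j) input pairs
for j in range(2):
    bins = np.floor(design[:, j] * s).astype(int)
    assert sorted(bins) == list(range(s))    # Latin hypercube property per margin
```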
Results
Figures 6 and 7 show the values of \(| k_{\zeta }(t_i,t_j) - k_{\delta }(t_i,t_j)|\) as a function of the number of model runs in the case of squared exponential covariance kernels for both \(k_f\) and \(k_{\delta }\). The hyperparameter values were fixed at \(\eta _f = \eta _{\delta } = 1\), with varying values for the length scales such that \(l_f = l_{\delta }\).
Analogously, Figs. 8 and 9 correspond to the case of tensor-product Matérn kernels with the standard choice of smoothness parameter \(\lambda = 2.5\), which is sufficient according to the theory discussed in Sect. 4.1.1. The remaining hyperparameters are the same as in the case of the squared exponential kernel.
We can see that both choices of kernel functions for \(k_f\) and \(k_{\delta }\) exhibit the hypothesized dominance of \(k_{\delta }\) as the number of model runs increases. This happens irrespective of the choice of kernel function, model inputs t, calibration parameter \(\theta \), and length scales. In the case of the squared exponential kernel in particular, the absolute difference between \(k_{\zeta }\) and \(k_{\delta }\) decreases quickly and reaches the limits of numerical stability. On the other hand, the rate of convergence is considerably slower for the Matérn kernel, which is likely related to its limited smoothness.
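The shrinking gap can be reproduced in a few lines. Under a Kennedy-O'Hagan-type decomposition, \(k_{\zeta }\) equals \(k_{\delta }\) plus the conditional covariance of the emulator GP given the s model runs (a schematic form assumed here for illustration), so the gap \(|k_{\zeta } - k_{\delta }|\) is the emulator's conditional covariance, which collapses under a space-filling design. The kernel and hyperparameter choices below are illustrative, not the paper's.

```python
import numpy as np

def sq_exp(x1, x2, eta=1.0, l=0.05):
    d2 = (x1[:, None] - x2[None, :]) ** 2
    return eta**2 * np.exp(-d2 / (2 * l**2))

# Conditional covariance of the emulator GP f given s design points D:
#   k_{f|s}(t, t') = k_f(t, t') - k_f(t, D) K_DD^{-1} k_f(D, t').
# In the schematic decomposition k_zeta = k_delta + k_{f|s}, this term is
# exactly the gap |k_zeta - k_delta| studied in this appendix.
t_eval = np.array([0.2, 0.7])
gaps = []
for s in [5, 10, 20, 40, 80]:
    D = (np.arange(s) + 0.5) / s                 # evenly spaced, space-filling design
    K_DD = sq_exp(D, D) + 1e-8 * np.eye(s)       # small jitter for numerical stability
    K_eD = sq_exp(t_eval, D)
    K_cond = sq_exp(t_eval, t_eval) - K_eD @ np.linalg.solve(K_DD, K_eD.T)
    gaps.append(np.abs(K_cond).max())
assert gaps[-1] < 1e-2 * gaps[0]                 # the gap collapses as s grows
```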
Additional results for simulation study: Transverse harmonic wave
The following figures show additional results of the empirical Bayes fit for the transverse harmonic wave simulation study at the time locations \(t=0\), \(t=0.43\), \(t=0.71\), and \(t=1\) (Figs. 10, 11, 12, 13, 14, 15, 16, 17).
Cite this article
Kejzlar, V., Son, M., Bhattacharya, S. et al. A fast and calibrated computer model emulator: an empirical Bayes approach. Stat Comput 31, 49 (2021). https://doi.org/10.1007/s11222-021-10024-8