Abstract
Mathematical models implemented on a computer have become a driving force in accelerating the cycle of scientific discovery, because computer models are typically much faster and more economical to run than physical experiments. In this work, we develop an empirical Bayes approach to predicting physical quantities using a computer model, where the computer model under consideration needs to be calibrated and is computationally expensive. We propose a Gaussian process emulator and a Gaussian process model for the systematic discrepancy between the computer model and the underlying physical process. This allows for closed-form and easy-to-compute predictions given by a conditional distribution induced by the Gaussian processes. We provide a rigorous theoretical justification of the proposed approach by establishing posterior consistency of the estimated physical process. The computational efficiency of the methods is demonstrated in an extensive simulation study and a real data example. The newly established approach makes enhanced use of computer models from both practical and theoretical standpoints.
References
Amewou-Atisso, M., Ghosal, S., Ghosh, J.K., Ramamoorthi, R.: Posterior consistency for semi-parametric regression problems. Bernoulli 9(2), 291–312 (2003). https://doi.org/10.3150/bj/1068128979
Audi, G., Wapstra, A., Thibault, C.: The AME2003 atomic mass evaluation: (II). Tables, graphs and references. Nuclear Physics A 729, 337–676 (2003), URL http://www.sciencedirect.com/science/article/pii/S0375947403018098
Bayarri, M.J., Berger, J.O., Paulo, R., Sacks, J., Cafeo, J.A., Cavendish, J., Lin, C.H., Tu, J.: A framework for validation of computer models. Technometrics 49, 138–154 (2007). https://doi.org/10.1198/004017007000000092
Belitser, E., Enikeeva, F.: Empirical Bayesian test of the smoothness. Math. Methods of Stat. 17(1), 1–18 (2008). https://doi.org/10.3103/S1066530708010018
Benzaid, D., Bentridi, S., Kerraci, A., Amrani, N.: Bethe-Weizsäcker semiempirical mass formula coefficients 2019 update based on AME2016. Nucl. Sci. Tech. 31, 9 (2020). https://doi.org/10.1007/s41365-019-0718-8
Bertsch, G.F., Bingham, D.: Estimating parameter uncertainty in binding-energy models by the frequency-domain bootstrap. Physical Review Letters 119, 252501 (2017), URL https://link.aps.org/doi/10.1103/PhysRevLett.119.252501
Bethe, H.A., Bacher, R.F.: Nuclear physics A. Stationary states of nuclei. Rev. Mod. Phys. 8, 82–229 (1936). https://doi.org/10.1103/RevModPhys.8.82, URL https://link.aps.org/doi/10.1103/RevModPhys.8.82
Brynjarsdóttir, J., O’Hagan, A.: Learning about physical parameters: the importance of model discrepancy. Inverse Problems 30, 114007 (2014). https://doi.org/10.1088%2F0266-5611%2F30%2F11%2F114007
Choi, T.: Posterior consistency in nonparametric regression problems under Gaussian process priors (2005)
Choi, T.: Alternative posterior consistency results in nonparametric binary regression using Gaussian process priors. Journal of Statistical Planning and Inference 137(9), 2975–2983 (2007). https://doi.org/10.1016/j.jspi.2006.11.001, URL http://www.sciencedirect.com/science/article/pii/S0378375807000407
Choi, T., Schervish, M.J.: On posterior consistency in nonparametric regression problems. Journal of Multivariate Analysis 98(10), 1969–1987 (2007). https://doi.org/10.1016/j.jmva.2007.01.004, URL http://www.sciencedirect.com/science/article/pii/S0047259X07000048
Fayans, S.A.: Towards a universal nuclear density functional. J Exp. Theor. Phy. Lett. 68(3), 169–174 (1998). https://doi.org/10.1134/1.567841
Florens, J.P., Simoni, A.: Regularized posteriors in linear ill-posed inverse problems. Scandinavian Journal of Statistics 39(2), 214–235 (2012). https://doi.org/10.1111/j.1467-9469.2011.00784.x, URL https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1467-9469.2011.00784.x
Gelman, A., Carlin, J., Stern, H., Dunson, D., Vehtari, A., Rubin, D.: Bayesian Data Analysis, 3rd edn. CRC Press (2013), URL https://books.google.com/books?id=ZXL6AQAAQBAJ
Ghosal, S., Roy, A.: Posterior consistency of Gaussian process prior for nonparametric binary regression. Ann. Statist. 34(5), 2413–2429 (2006). https://doi.org/10.1214/009053606000000795
Gu, M., Wang, L.: Scaled Gaussian stochastic process for computer model calibration and prediction. SIAM/ASA J Uncertain. Quantif. 6(4), 1555–1583 (2018). https://doi.org/10.1137/17M1159890
Higdon, D., Gattiker, J., Williams, B., Rightley, M.: Computer model calibration using high-dimensional output. Journal of the American Statistical Association 103, 570–583 (2008), URL http://www.jstor.org/stable/27640080
Higdon, D., McDonnell, J.D., Schunck, N., Sarich, J., Wild, S.M.: A Bayesian approach for parameter estimation and prediction using a computationally intensive model. J Phy. G: Nuclear Part. Phys. 42(3), 034009 (2015). https://doi.org/10.1088/0954-3899/42/3/034009
Kejzlar, V., Maiti, T.: Variational inference with vine copulas: an efficient approach for Bayesian computer model calibration. arXiv:2003.12890 (2020)
Kejzlar, V., Neufcourt, L., Nazarewicz, W., Reinhard, P.G.: Statistical aspects of nuclear mass models. Journal of Physics G: Nuclear and Particle Physics 47(9), 094001 (2020). https://doi.org/10.1088/1361-6471/ab907c, URL https://doi.org/10.1088%2F1361-6471%2Fab907c
Kennedy, M.C., O’Hagan, A.: Bayesian calibration of computer models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 63, 425–464 (2001). https://doi.org/10.1111/1467-9868.00294, URL https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9868.00294
King, G.B., Lovell, A.E., Neufcourt, L., Nunes, F.M.: Direct comparison between Bayesian and frequentist uncertainty quantification for nuclear reactions. Phy. Rev. Lett. 122, 232502 (2019)
Kirson, M.W.: Mutual influence of terms in a semi-empirical mass formula. Nucl Phys A 798(1), 29–60 (2008). https://doi.org/10.1016/j.nuclphysa.2007.10.011, URL http://www.sciencedirect.com/science/article/pii/S0375947407007531
Knapik, B., Szabo, B., van der Vaart, A., van Zanten, J.: Bayes procedures for adaptive inference in inverse problems for the white noise model. Probability Theory and Related Fields (2016). https://doi.org/10.1007/s00440-015-0619-7
Kortelainen, M., Lesinski, T., Moré, J.J., Nazarewicz, W., Sarich, J., Schunck, N., Stoitsov, M.V., Wild, S.M.: Nuclear energy density optimization. Phy. Rev.C 82(2), 024313 (2010). https://doi.org/10.1103/PhysRevC.82.024313
Kortelainen, M., McDonnell, J., Nazarewicz, W., Reinhard, P.G., Sarich, J., Schunck, N., Stoitsov, M.V., Wild, S.M.: Nuclear energy density optimization: large deformations. Phy. Rev. C 85, 024304 (2012). https://doi.org/10.1103/PhysRevC.85.024304
Kortelainen, M., McDonnell, J., Nazarewicz, W., Olsen, E., Reinhard, P.G., Sarich, J., Schunck, N., Wild, S.M., Davesne, D., Erler, J., Pastore, A.: Nuclear energy density optimization: Shell structure. Phys Rev C 89, 054314 (2014). https://doi.org/10.1103/PhysRevC.89.054314
Krane, K.: Introductory Nuclear Physics. Wiley (1987), URL https://books.google.com/books?id=ConwAAAAMAAJ
Martino, L., Laparra, V., Camps-Valls, G.: Probabilistic cross-validation estimators for Gaussian process regression. In: 2017 25th European Signal Processing Conference (EUSIPCO), pp. 823–827 (2017)
McDonnell, J.D., Schunck, N., Higdon, D., Sarich, J., Wild, S.M., Nazarewicz, W.: Uncertainty quantification for nuclear density functional theory and information content of new measurements. Phy. Rev. Lett. 114(12), 122501 (2015). https://doi.org/10.1103/PhysRevLett.114.122501
Morris, M.D., Mitchell, T.J.: Exploratory designs for computational experiments. Journal of Statistical Planning and Inference 43(3), 381–402 (1995). https://doi.org/10.1016/0378-3758(94)00035-T, URL http://www.sciencedirect.com/science/article/pii/037837589400035T
Myers, W.D., Swiatecki, W.J.: Nuclear masses and deformations. Nucl Phys 81(2), 1–60 (1966). https://doi.org/10.1016/S0029-5582(66)80001-9, URL http://www.sciencedirect.com/science/article/pii/S0029558266800019
Neufcourt, L., Cao, Y., Nazarewicz, W., Viens, F.: Bayesian approach to model-based extrapolation of nuclear observables. Physical Review C 98, 034318 (2018), URL https://link.aps.org/doi/10.1103/PhysRevC.98.034318
Neufcourt, L., Cao, Y., Nazarewicz, W., Olsen, E., Viens, F.: Neutron drip line in the Ca region from Bayesian model averaging. Physical Review Letters 122, 062502 (2019), URL https://link.aps.org/doi/10.1103/PhysRevLett.122.062502
Plumlee, M.: Bayesian calibration of inexact computer models. J Am. Stat. Assoc. 112, 1274–1285 (2017). https://doi.org/10.1080/01621459.2016.1211016
Plumlee, M.: Computer model calibration with confidence and consistency. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 81(3), 519–545 (2019). https://doi.org/10.1111/rssb.12314, URL https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/rssb.12314
Plumlee, M., Joseph, V.R., Yang, H.: Calibrating functional parameters in the ion channel models of cardiac cells. J Am. Stat. Assoc. 111, 500–509 (2016)
Pollard, D., Chang, W., Haran, M., Applegate, P., DeConto, R.: Large ensemble modeling of the last deglacial retreat of the West Antarctic Ice Sheet: comparison of simple and advanced statistical techniques. Geosci. Model Dev. 9(5), 1697–1723 (2016)
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press, Cambridge, MA (2006), URL www.GaussianProcess.org/gpml
Reinhard, P.G., Bender, M., Nazarewicz, W., Vertse, T.: From finite nuclei to the nuclear liquid drop: Leptodermous expansion based on self-consistent mean-field theory. Phys Rev C 73, 014309 (2006). https://doi.org/10.1103/PhysRevC.73.014309, URL https://link.aps.org/doi/10.1103/PhysRevC.73.014309
Rousseau, J., Szabo, B.: Asymptotic behaviour of the empirical Bayes posteriors associated to the maximum marginal likelihood estimator. Ann. Statist. 45(2), 833–865 (2017). https://doi.org/10.1214/16-AOS1469
Schaeffer, D.G., Cain, J.W.: Nonlinear Systems: Local Theory, Springer New York, New York, NY, pp. 79–109 (2016). https://doi.org/10.1007/978-1-4939-6389-8_3
Schunck, N., O’Neal, J., Grosskopf, M., Lawrence, E., Wild, S.M.: Calibration of energy density functionals with deformed nuclei. Journal of Physics G: Nuclear and Particle Physics (2020). URL http://iopscience.iop.org/10.1088/1361-6471/ab8745
Serra, P., Krivobokova, T.: Adaptive empirical Bayesian smoothing splines. Bayesian Anal. 12(1), 219–238 (2017). https://doi.org/10.1214/16-BA997
Sexton, D.M.H., Murphy, J.M., Collins, M., Webb, M.J.: Multivariate probabilistic projections using imperfect climate models. Part I: outline of methodology. Climate Dyn. 38(11), 2513–2542 (2012)
Shiryaev, A.N.: Convergence of Probability Measures. Central Limit Theorem, Springer New York, New York, NY, pp. 308–378 (1996). https://doi.org/10.1007/978-1-4757-2539-1_4
Sniekers, S., van der Vaart, A.: Adaptive Bayesian credible sets in regression with a Gaussian process prior. Electron J Statist. 9(2), 2475–2527 (2015). https://doi.org/10.1214/15-EJS1078
Sundararajan, S., Keerthi, S.S.: Predictive approaches for choosing hyperparameters in Gaussian processes. Neural Comput. 13(5), 1103–1118 (2001). https://doi.org/10.1162/08997660151134343
Szabó, B.T., van der Vaart, A.W., van Zanten, J.H.: Empirical Bayes scaling of Gaussian priors in the white noise model. Electron J Statist. 7, 991–1018 (2013). https://doi.org/10.1214/13-EJS798
Teckentrup, A.L.: Convergence of Gaussian process regression with estimated hyper-parameters and applications in Bayesian inverse problems. SIAM/ASA J Uncertain. Quantif. 8(4), 1310–1337 (2020). https://doi.org/10.1137/19M1284816
Tokdar, S.T., Ghosh, J.K.: Posterior consistency of logistic Gaussian process priors in density estimation. Journal of Statistical Planning and Inference 137(1), 34–42 (2007). https://doi.org/10.1016/j.jspi.2005.09.005, URL http://www.sciencedirect.com/science/article/pii/S0378375805002648
Tuo, R., Wu, C.F.J.: Efficient calibration for imperfect computer models. Ann. Stat. 43, 2331–2352 (2015). https://doi.org/10.1214/15-AOS1314
Tuo, R., Wu, C.F.J.: A theoretical framework for calibration in computer models: parametrization, estimation and convergence properties. SIAM/ASA J Uncertain. Quantif. 4, 767–795 (2016). https://doi.org/10.1137/151005841
van der Vaart, A., Wellner, J.: Weak Convergence and Empirical Processes: With Applications to Statistics. Springer Series in Statistics, Springer (1996), URL https://books.google.com/books?id=OCenCW9qmp4C
van der Vaart, A.W., van Zanten, J.H.: Rates of contraction of posterior distributions based on Gaussian process priors. Ann. Statist. 36(3), 1435–1463 (2008). https://doi.org/10.1214/009053607000000613
Wahba, G.: Spline Models for Observational Data. Society for Industrial and Applied Mathematics (1990). https://doi.org/10.1137/1.9781611970128, URL https://epubs.siam.org/doi/abs/10.1137/1.9781611970128
Weizsäcker, C.F.v.: Zur Theorie der Kernmassen. Z Phys. 96(7), 431–458 (1935). https://doi.org/10.1007/BF01337700
Williams, B., Higdon, D., Gattiker, J., Moore, L., McKay, M., Keller-McNulty, S.: Combining experimental data and computer simulations, with an application to flyer plate experiments. Bayesian Anal. 1(4), 765–792 (2006)
Xie, F., Xu, Y.: Bayesian projected calibration of computer models. J Am. Stat. Assoc. 0(0), 1–18 (2020). https://doi.org/10.1080/01621459.2020.1753519
Yuan, C.: Uncertainty decomposition method and its application to the liquid drop model. Phys Rev C 93, 034310 (2016). https://doi.org/10.1103/PhysRevC.93.034310, URL https://link.aps.org/doi/10.1103/PhysRevC.93.034310
Zhang, L., Jiang, Z., Choi, J., Lim, C.Y., Maiti, T., Baek, S.: Patient-specific prediction of abdominal aortic aneurysm expansion using Bayesian calibration. IEEE Journal of Biomedical and Health Informatics (2019). https://doi.org/10.1109/JBHI.2019.2896034
Acknowledgements
The authors thank the reviewers and the Editor for their helpful comments and ideas. This research was partially supported by National Science Foundation grant DMS-1952856.
Appendices
Equivalency of hierarchical model
To establish the equivalency between the Bayesian model given by the data likelihood \(p(\varvec{d}|\varvec{\theta }, \varvec{\phi }, \sigma )\) and the hierarchical model (see Sect. 4), we need to show that the following equality holds
where \(\varvec{\zeta }= (\zeta (t_1), \dots ,\zeta (t_n)) = (\zeta _1, \dots ,\zeta _n)\) and the density \(p(\varvec{\zeta }, \varvec{z}| \varvec{\theta }, \varvec{\phi })\) is the multivariate normal distribution with mean \(M(\varvec{\theta },\varvec{\phi })\) (see (6)) and covariance
For ease of notation, let us now write \(M(\varvec{\theta },\varvec{\phi }) = (M_{y}^T, M_{z}^T)^T\). Then
The integral is equal to 1 since it is the integral of a multivariate normal probability density function over \(\zeta \) with covariance \(( (\sigma ^2 I_n)^{-1} + (C_{11} - C_{12}C^{-1}_{22} C_{21})^{-1})^{-1}\). Namely
where we used the Schur complement identity for determinants in the first equality and
Lastly, considering the notation
we have
where \(C^-_{11} = C_{11} - C_{12}C^{-1}_{22} C_{21}\) and \(\varvec{b}\) is a constant column vector. This shows that the integral is indeed equal to 1 as stated, and the equality (30) holds.
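The determinant identity invoked above can be sanity-checked numerically. The following sketch builds an arbitrary random positive-definite block matrix (the block sizes are illustrative) and verifies the Schur complement identity \(\det C = \det (C_{22})\,\det (C_{11} - C_{12}C^{-1}_{22} C_{21})\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Random symmetric positive-definite block matrix C = [[C11, C12], [C21, C22]].
n, m = 4, 3
A = rng.standard_normal((n + m, n + m))
C = A @ A.T + (n + m) * np.eye(n + m)   # adding a multiple of I keeps C well-conditioned
C11, C12 = C[:n, :n], C[:n, n:]
C21, C22 = C[n:, :n], C[n:, n:]

# Schur complement identity for determinants:
#   det(C) = det(C22) * det(C11 - C12 C22^{-1} C21)
schur = C11 - C12 @ np.linalg.solve(C22, C21)
lhs = np.linalg.det(C)
rhs = np.linalg.det(C22) * np.linalg.det(schur)
assert np.isclose(lhs, rhs, rtol=1e-8)
```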
Proof of Theorem 1
Note that for any \(\epsilon > 0\), the posterior probability of interest \(p(\zeta \in U_n^C|y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n)\) can be bounded from above as
where
Since the assumption (A3) implies that \(1_{ \{\left| \frac{{\hat{\sigma }}_{n}}{\sigma _0}-1\right| > \epsilon \}}\xrightarrow [\text {n}]{\quad } 0 \text { a.s. } P_0\), it is enough to show that there exists \(\epsilon > 0\) so that
for some \(\beta _1, \beta _2, \beta _3 > 0\) where \(\beta _3 \le \min \{\beta _1, \beta _2\}\).
The rest of the proof follows the general steps of the proof of Theorem 1 in Choi and Schervish (2007) and Theorem 9 in Choi (2007), with some non-trivial treatment of the constant \(\epsilon \). As in Choi (2007), we assume that the covariance function is smooth enough that the supremum of the variance of the Gaussian process is continuous with respect to \((\varvec{\theta }, \varvec{\phi })\) on the compact set \(\varUpsilon \). We provide step-by-step details below.
Step 1) By Markov's inequality, for any \(\rho > 0\)
which due to the condition (i) of (A2) and the first Borel-Cantelli Lemma yields
Since this does not depend on \((\varvec{\theta }, \varvec{\phi })\), it implies (31).
Step 2) By Fubini's theorem, for any \(0<\epsilon < {\tilde{\epsilon }}_2\)
where \(\tilde{c}_\epsilon = c_2 + \log (1-\epsilon ) - \log (1+\epsilon )\); condition (iii) of (A2) implies \(\tilde{c}_\epsilon > 0\). Thus
Therefore, for any \(\epsilon > 0\) so that \(\epsilon < {\tilde{\epsilon }}_2\) there exists a constant \(\tilde{c}_\epsilon \) for which the first Borel-Cantelli Lemma implies
Since this does not depend on \((\varvec{\theta }, \varvec{\phi })\), it implies (32).
Step 3) Proceeding as in step 2), Fubini's theorem implies
Condition (ii) of (A2) and the first Borel-Cantelli Lemma imply that for any \(\epsilon < \frac{1 - e^{-c_1}}{1 + e^{-c_1}}\):
where \(\tilde{k}_\epsilon = c_1 + \log (1 -\epsilon ) - \log (1+\epsilon )\).
Step 4) To prove (34), given any \(0<\rho < 1\), we first observe the following:
Let us now define \(\log _+(x)= \max \{0, \log (x)\}\) and \(\log _-(x)= -\min \{0, \log (x)\}\) as well as
Then we get
Hence, by condition (i) of (A1), for any \(\rho < {\tilde{\epsilon }}_1\) and \(\zeta \in B\)
and by Kolmogorov's strong law of large numbers for independent non-identically distributed random variables (e.g. Shiryaev (1996), Chapter 3),
As a result, for every \(\zeta \in B\), with \(P_0\) probability 1
The fourth line follows from the almost sure convergence proved in the previous paragraph, and the second-to-last line follows from Amewou-Atisso et al. (2003). We now make use of condition (ii) of (A1). Let us consider \(\beta > 0\), select \(\varDelta \) so that \(\varDelta + \sqrt{\frac{\varDelta }{2}} \le \frac{\beta }{8}\), and set \(C = B \cap \{\zeta : K_i(\zeta _0,\zeta ) < \varDelta \ \text { for all }i\}\). By (A1) there exists \({\tilde{\epsilon }}_1\) so that \(\varPi (C|\varvec{\theta }, \varvec{\phi }) > 0\) for all \(0<\rho <{\tilde{\epsilon }}_1\). Therefore, for each \(\zeta \in C\)
since \(\frac{1}{n}\sum _{i=1}^{n} K_{i}(\zeta _0,\zeta ) <\varDelta \) for all \(\zeta \in C\). Finally, for any \(\rho < \min \{{\tilde{\epsilon }}_1, \frac{1-e^{\frac{-\beta }{8}}}{1+e^{\frac{-\beta }{8}}}\}\)
Note that the actual bound on \(\mathbf {I}_{3n}\) does not depend on \((\varvec{\theta }, \varvec{\phi })\). Taking \(\epsilon < \min \{{\tilde{\epsilon }}_2, \frac{1 - e^{-c_1}}{1 + e^{-c_1}}\}\) concludes the proof.
Proof of Lemma 1
Theorem 5 of Ghosal and Roy (2006) implies that there exist positive constants \(C, d_1, \dots , d_p\) so that for \(i = 1, \dots , p\)
The continuity of \(\rho ^2_i(\varvec{\theta }, \varvec{\phi })\), for \(i = 0,\cdots , p\), on a compact set \(\varUpsilon \) implies that they are uniformly bounded. Therefore, there exist universal constants \({(c_{0,1},c_{0,2})}, \cdots \), \({(c_{p,1}, c_{p,2})}\) such that for \(i = 0, \cdots , p\),
Hence, for \(i = 0, \cdots , p\),
Proof of Lemma 2
We shall first define some notation. Let \(0<r <\frac{\nu }{2}\) and \(t = \frac{r}{4}\). Let \(N_t = N(t, \mathcal {F}_n, \parallel \cdot \parallel _{\infty })\) be the covering number of \(\mathcal {F}_n\). In Theorem 2.7.1, van der Vaart and Wellner (1996) show that there exists a constant K so that \(\log N_t \le \frac{KM_n}{t^p}\), and therefore \(N_t = \mathcal {O}(M_n)\), where \(M_n = \mathcal {O}(n^\alpha )\) for \(\alpha \in (\frac{1}{2},1)\) according to the definition of the sieves. Let us consider \(\tau \in (\frac{\alpha }{2}, \frac{1}{2})\) and define \(c_n = n^{\tau }\) so that \(\log (N_t) = o(c_n^2)\). Moreover, let \(\zeta ^1, \dots , \zeta ^{N_t} \in \mathcal {F}_n\) be finitely many elements of the sieve such that for every \(\zeta \in \mathcal {F}_n\) there is \(i \in \{1, \dots , N_t\}\) satisfying \(\parallel \zeta - \zeta ^i \parallel _\infty < t\). This implies that if \(\zeta \in \mathcal {F}_n\) is such that \(\int |\zeta (\varvec{t}) - \zeta _0(\varvec{t})| \text {d}Q_n(\varvec{t}) > \nu \), then \(\int |\zeta ^i(\varvec{t}) - \zeta _0(\varvec{t})| \text {d}Q_n(\varvec{t}) > \frac{\nu }{2}\).
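The rate comparison behind \(\log (N_t) = o(c_n^2)\) is elementary: \(\log N_t \lesssim M_n = \mathcal {O}(n^\alpha )\) while \(c_n^2 = n^{2\tau }\) with \(2\tau > \alpha \). A quick numerical sketch illustrates the decay of the ratio; the constants K, t, p and the exponents chosen below are hypothetical placeholders, not the paper's values.

```python
import numpy as np

# With M_n = n**alpha (alpha in (1/2, 1)) and c_n = n**tau (tau in (alpha/2, 1/2)),
# the entropy bound log N_t <= K * M_n / t**p gives log(N_t) / c_n**2 -> 0,
# since the ratio scales like n**(alpha - 2*tau) and alpha - 2*tau < 0.
alpha, tau = 0.8, 0.45          # tau > alpha / 2 = 0.4
K, t, p = 1.0, 0.1, 2           # hypothetical constants from the covering bound
ns = np.array([1e3, 1e5, 1e7, 1e9])
ratio = (K * ns**alpha / t**p) / ns**(2 * tau)
assert np.all(np.diff(ratio) < 0)      # the bound on log(N_t)/c_n^2 decreases in n
assert ratio[-1] < 0.3 * ratio[0]      # here with n^(-0.1) decay toward zero
```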
The next step in the proof is to construct a test for each \(\zeta ^i\), with the resulting functions \(\varPhi _n\) defined as a combination of the individual tests, and to show that the probabilities of type I and type II errors satisfy the properties of the lemma. Let us recall that \(\zeta _j = \zeta (\varvec{t}_j)\) and \(\zeta _{0,j} = \zeta _0 (\varvec{t}_j)\). For an arbitrary \(\zeta \in \mathcal {F}_n\) such that \(\parallel \zeta - \zeta ^i \parallel _\infty < t\), let us define \(\zeta _{1,j} = \zeta ^i(\varvec{t}_j)\), and \(b_j = 1\) if \(\zeta _{1,j} > \zeta _{0,j}\) and \(b_j = -1\) otherwise. For any \(\nu > 0\), let \(\varPsi _n[\zeta , \nu ]\) be the indicator of the set A defined as follows
The test functions \(\varPhi _n\) are then
Type I error) Mill's ratio implies
The function \(\varPhi (\cdot )\) is the CDF of the standard normal distribution. Consequently, we have
and
Type II error) It is sufficient to find i for which the probability of type II error of \(\varPsi _n[\zeta ^i, \frac{\nu }{2}]\), given an arbitrary \(\zeta \) in \(W^C_{\nu , n} \cap \mathcal {F}_n\), is sufficiently small. This is because the probability of type II error for the composite test \(\varPhi _n\) is no larger than the smallest among those of the \(\varPsi _n[\zeta ^i, \frac{\nu }{2}]\). Note that here we assume \(\int |\zeta (\varvec{t}) - \zeta _0(\varvec{t})| \text {d}Q_n(\varvec{t}) > \nu \), and then \(\int |\zeta ^i(\varvec{t}) - \zeta _0(\varvec{t})| \text {d}Q_n(\varvec{t}) > \frac{\nu }{2}\). For every \(r < \frac{\nu }{2}\), Choi (2005) shows that
Let n be large enough so that \(4\sigma _0 c_n < r\sqrt{n}\); then for any \(0<\epsilon <1\)
To establish the part (ii) of the lemma, we need to show that there exists \(0<{\tilde{\epsilon }}<1\) so that for any \(\epsilon <{\tilde{\epsilon }}\)
Take \(\kappa = \frac{r^2}{32 \sigma _0^2}\) and define \(b(\epsilon )\) to be the left-hand side of (35),
The function \(b(\epsilon )\) is clearly continuous at \(\epsilon = 0\). Hence, for each \(\kappa > 0\), there exists \({\tilde{\epsilon }}\) such that for all \(0<\epsilon <{\tilde{\epsilon }}\), \(b(\epsilon ) > 0\).
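The Mill's ratio bound used in the type I error step, \(1 - \varPhi (x) \le \phi (x)/x\) for \(x > 0\) (with \(\phi \) the standard normal density), is easy to verify numerically:

```python
import math

def phi(x):
    """Standard normal probability density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# Mill's ratio bound: 1 - Phi(x) <= phi(x) / x for all x > 0.
for x in [0.5, 1.0, 2.0, 4.0]:
    assert 1 - Phi(x) <= phi(x) / x
```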
Proof of Theorem 2
First, we show that \({\hat{\sigma }}_n^2\) is asymptotically unbiased. Note that
because \(\epsilon _i {\mathop {\sim }\limits ^{i.i.d.}} \mathcal {N}(0,1)\). Consequently
Since \(\zeta _0\) is continuously differentiable on the compact and convex set \(\varvec{\varOmega }\), it is also (globally) Lipschitz on \(\varvec{\varOmega }\) (e.g. Schaeffer and Cain (2016), Corollary 3.2.4), and there exists a real constant K so that
Therefore, due to the design assumption (AD)
and the combination of (36) with (37) implies
To show the almost sure convergence of \({\hat{\sigma }}_n^2\), let us now denote \(x_i = (y_{i + 1} - y_i)^2\) and rewrite the estimator \({\hat{\sigma }}_n^2\) as a sum of two estimators, each consisting of a sum of independent variables:
Without loss of generality, we assumed that n is an odd integer. Lastly, note that \(\mathbb {V}ar(x_i) \le C < \infty \) uniformly in i. This is because the differences \(\zeta _0(\varvec{t}_{i+1}) - \zeta _0(\varvec{t}_{i})\) are uniformly bounded on the compact set \(\varvec{\varOmega }\) due to the continuity of \(\zeta _0\), and the \(y_{i+1} - y_i\) are normal with bounded moments. We can now apply Kolmogorov's strong law of large numbers for independent non-identically distributed random variables (e.g. Shiryaev (1996), Chapter 3),
and as a result
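The consistency of the first-difference variance estimator can be illustrated with a small Monte Carlo sketch. The normalization \({\hat{\sigma }}_n^2 = \sum _i (y_{i+1}-y_i)^2 / (2(n-1))\), the smooth truth, and the numerical values below are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# First-difference estimator of the noise variance (normalization assumed here):
#   sigma2_hat = sum_i (y_{i+1} - y_i)^2 / (2 (n - 1)).
# Because zeta_0 is Lipschitz and the sorted design fills in, the signal
# increments zeta_0(t_{i+1}) - zeta_0(t_i) contribute a vanishing amount,
# and sigma2_hat converges to sigma_0^2.
sigma0 = 0.5
n = 200_000
t = np.sort(rng.uniform(0, 1, n))        # space-filling design on [0, 1]
zeta0 = np.sin(2 * np.pi * t)            # hypothetical smooth true process
y = zeta0 + sigma0 * rng.standard_normal(n)
sigma2_hat = np.sum(np.diff(y) ** 2) / (2 * (n - 1))
assert abs(sigma2_hat - sigma0**2) < 0.01
```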
The LDM calibration
The analysis of the LDM follows our previous study in Kejzlar and Maiti (2020). Here we provide a concise discussion of the choices of prior distributions and the GP specifications.
GP specifications. For the computer model \(E_{\text {B}}(Z,N)\), we consider a GP prior with mean zero and covariance function
We also assume a GP prior for the systematic discrepancy \(\delta (Z,N)\) with mean zero and covariance function
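The explicit kernel expressions are not reproduced in this version of the text. As an illustration only, a squared-exponential kernel of the general form used for such GP priors (the function name and hyperparameter values below are hypothetical) can be implemented as:

```python
import numpy as np

def sq_exp_kernel(X1, X2, eta=1.0, lengthscale=1.0):
    """k(x, x') = eta^2 * exp(-||x - x'||^2 / (2 * lengthscale^2))."""
    X1, X2 = np.atleast_2d(X1), np.atleast_2d(X2)
    d2 = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return eta**2 * np.exp(-d2 / (2 * lengthscale**2))

# Example: a covariance matrix over a few (Z, N) input pairs.
ZN = np.array([[8.0, 8.0], [20.0, 20.0], [28.0, 28.0]])
K = sq_exp_kernel(ZN, ZN, eta=2.0, lengthscale=10.0)
assert np.allclose(K, K.T)                  # symmetric
assert np.all(np.linalg.eigvalsh(K) > 0)    # positive definite
```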
Prior distributions. The prior distributions for the calibration parameters \(\varvec{\theta }\) are chosen to be wide enough to cover the space of all their reasonable values:
The prior distributions for the hyperparameters \(\varvec{\phi }\) were selected as \(Gamma(\alpha , \beta )\) with the shape parameter \(\alpha \) and scale parameter \(\beta \). They are chosen to be weakly informative so that they correspond to the scale of these parameters given by the literature on nuclear mass models (Weizsäcker 1935; Bethe and Bacher 1936; Myers and Swiatecki 1966; Fayans 1998; Kirson 2008; McDonnell et al. 2015; Kortelainen et al. 2010, 2012, 2014; Benzaid et al. 2020; Kejzlar et al. 2020). In particular,
Since the majority of the masses in the training dataset are larger than 1000 MeV, we consider the following prior for \(\eta _f\) to reflect this:
Numerical study of the conditional covariance \(k_\zeta \)
Here we present a numerical investigation of our conjecture about the asymptotic behavior of the conditional covariance \(k_{\zeta }\). We show that, as the number of model evaluations s increases (assuming some space-filling design), the covariance function \(k_{\delta }\) quickly dominates, which strongly suggests that \(k_{\zeta }\) and \(k_{\delta }\) have similar asymptotic behavior with respect to s. Our rationale is that by informing the prior distribution for \(\zeta \) with more model evaluations, we effectively reduce the uncertainty about the computer model.
Study design
We consider a simple scenario with the joint space of model and calibration inputs over \([0,1]^2\). The input pairs \((\tilde{t}_j, {\tilde{\theta }}_j)\) were generated using a space-filling Latin hypercube design. The true value of the calibration parameter was varied over \(\theta \in \{0.3, 0.5, 0.8\}\).
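A Latin hypercube design of this kind can be generated in a few lines of NumPy; the sample size s below is a hypothetical placeholder, while the design property checked at the end (one point per \(1/s\) stratum in each coordinate) is what defines the Latin hypercube.

```python
import numpy as np

rng = np.random.default_rng(0)

def latin_hypercube(n, d, rng):
    """n points in [0, 1]^d with one point per 1/n stratum in every coordinate."""
    # Stratified samples: row i lies in [i/n, (i+1)/n) in every column...
    u = (rng.random((n, d)) + np.arange(n)[:, None]) / n
    # ...then randomly permute each column to decouple the coordinates.
    for j in range(d):
        u[:, j] = u[rng.permutation(n), j]
    return u

s = 25                                   # hypothetical number of design points
design = latin_hypercube(s, 2, rng)      # rows are (t_j, theta_j) input pairs
for j in range(2):
    bins = np.floor(design[:, j] * s).astype(int)
    assert sorted(bins) == list(range(s))    # Latin hypercube property per margin
```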
Results
Figures 6 and 7 show the values of \(| k_{\zeta }(t_i,t_j) - k_{\delta }(t_i,t_j)|\) as a function of the number of model runs in the case of squared exponential covariance kernels for both \(k_f\) and \(k_{\delta }\). The hyperparameter values were fixed at \(\eta _f = \eta _{\delta } = 1\), with varying values for the length scales such that \(l_f = l_{\delta }\).
Analogously, Figs. 8 and 9 correspond to the case of tensor-product Matérn kernels with the standard choice of smoothness parameter \(\lambda = 2.5\), which is sufficient according to the theory discussed in Sect. 4.1.1. The remaining hyperparameters are the same as in the case of the squared exponential kernel.
We can see that both choices of kernel functions for \(k_f\) and \(k_{\delta }\) exhibit the hypothesized dominance of \(k_{\delta }\) as the number of model runs increases. This happens irrespective of the choice of kernel function, model inputs t, calibration parameter \(\theta \), and length scales. In the case of the squared exponential kernel in particular, the absolute difference between \(k_{\zeta }\) and \(k_{\delta }\) decreases quickly and reaches the limits of numerical stability. On the other hand, the rate of convergence is considerably slower for the Matérn kernel, which is likely related to its limited smoothness.
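The shrinking gap can be reproduced in a few lines. Under a Kennedy-O'Hagan-type decomposition, \(k_{\zeta }\) equals \(k_{\delta }\) plus the conditional covariance of the emulator GP given the s model runs (a schematic form assumed here for illustration), so the gap \(|k_{\zeta } - k_{\delta }|\) is the emulator's conditional covariance, which collapses under a space-filling design. The kernel and hyperparameter choices below are illustrative, not the paper's.

```python
import numpy as np

def sq_exp(x1, x2, eta=1.0, l=0.05):
    d2 = (x1[:, None] - x2[None, :]) ** 2
    return eta**2 * np.exp(-d2 / (2 * l**2))

# Conditional covariance of the emulator GP f given s design points D:
#   k_{f|s}(t, t') = k_f(t, t') - k_f(t, D) K_DD^{-1} k_f(D, t').
# In the schematic decomposition k_zeta = k_delta + k_{f|s}, this term is
# exactly the gap |k_zeta - k_delta| studied in this appendix.
t_eval = np.array([0.2, 0.7])
gaps = []
for s in [5, 10, 20, 40, 80]:
    D = (np.arange(s) + 0.5) / s                 # evenly spaced, space-filling design
    K_DD = sq_exp(D, D) + 1e-8 * np.eye(s)       # small jitter for numerical stability
    K_eD = sq_exp(t_eval, D)
    K_cond = sq_exp(t_eval, t_eval) - K_eD @ np.linalg.solve(K_DD, K_eD.T)
    gaps.append(np.abs(K_cond).max())
assert gaps[-1] < 1e-2 * gaps[0]                 # the gap collapses as s grows
```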
Additional results for simulation study: Transverse harmonic wave
The following figures show additional results of the empirical Bayes fit for the transverse harmonic wave simulation study at the time locations \(t=0\), \(t=0.43\), \(t=0.71\), and \(t=1\) (Figs. 10, 11, 12, 13, 14, 15, 16, 17).
Cite this article
Kejzlar, V., Son, M., Bhattacharya, S. et al. A fast and calibrated computer model emulator: an empirical Bayes approach. Stat Comput 31, 49 (2021). https://doi.org/10.1007/s11222-021-10024-8