
A fast and calibrated computer model emulator: an empirical Bayes approach


Abstract

Mathematical models implemented on a computer have become a driving force behind the acceleration of the cycle of scientific processes, because computer models are typically much faster and more economical to run than physical experiments. In this work, we develop an empirical Bayes approach to predictions of physical quantities using a computer model, where we assume that the computer model under consideration needs to be calibrated and is computationally expensive. We propose a Gaussian process emulator and a Gaussian process model for the systematic discrepancy between the computer model and the underlying physical process. This allows for closed-form, easy-to-compute predictions given by a conditional distribution induced by the Gaussian processes. We provide a rigorous theoretical justification of the proposed approach by establishing posterior consistency of the estimated physical process. The computational efficiency of the methods is demonstrated in an extensive simulation study and a real data example. The newly established approach makes enhanced use of computer models from both practical and theoretical standpoints.



Acknowledgements

The authors thank the reviewers and the Editor for their helpful comments and ideas. The research is partially supported by National Science Foundation grant DMS-1952856.

Author information


Corresponding author

Correspondence to Vojtech Kejzlar.


Appendices

Equivalency of the hierarchical model

To establish the equivalency between the Bayesian model given by the data likelihood \(p(\varvec{d}|\varvec{\theta }, \varvec{\phi }, \sigma )\) and the hierarchical model (see Sect. 4), we need to show that the following equality holds

$$\begin{aligned} p(\varvec{d}|\varvec{\theta }, \varvec{\phi }, \sigma ) =\int _{\varvec{\zeta }} \prod _i^{n} p(y_i| \zeta _i, \sigma ) p(\varvec{\zeta }, \varvec{z}| \varvec{\theta }, \varvec{\phi }) \text {d}{\varvec{\zeta }}, \end{aligned}$$
(30)

where \(\varvec{\zeta }= (\zeta (t_1), \dots ,\zeta (t_n)) = (\zeta _1, \dots ,\zeta _n)\) and the density \(p(\varvec{\zeta }, \varvec{z}| \varvec{\theta }, \varvec{\phi })\) is the multivariate normal density with mean \(M(\varvec{\theta },\varvec{\phi })\) (see (6)) and covariance

$$\begin{aligned} K_p(\varvec{\theta },\varvec{\phi }) =\begin{pmatrix} K_f(T_y(\varvec{\theta }), T_y(\varvec{\theta })) + K_\delta (T_y, T_y) &{} K_f( T_y(\varvec{\theta }), T_z({\widetilde{\varvec{\theta }}}))\\ K_f(T_z({\widetilde{\varvec{\theta }}}), T_y(\varvec{\theta })) &{} K_f(T_z({\widetilde{\varvec{\theta }}}), T_z({\widetilde{\varvec{\theta }}})) \end{pmatrix} =\begin{pmatrix} C_{11} &{} C_{12}\\ C_{21} &{} C_{22} \end{pmatrix}. \end{aligned}$$

For ease of notation, let us write \(M(\varvec{\theta },\varvec{\phi }) = (M_{y}^T, M_{z}^T)^T\). Then

$$\begin{aligned}&\int _{\varvec{\zeta }} \prod _i^{n} p(y_i| \zeta _i, \sigma ) p(\varvec{\zeta }, \varvec{z}| \varvec{\theta }, \varvec{\phi }) \text {d}{\varvec{\zeta }} \\&\quad = \int _{\varvec{\zeta }} \frac{1}{(2\pi )^{n/2} |\sigma ^2 I_n |^{1/2}} \text {exp}\bigg ( -\frac{1}{2} (\varvec{y}- \varvec{\zeta })^T(\sigma ^2I_n)^{-1}(\varvec{y}- \varvec{\zeta })\bigg )\\&\qquad \times \frac{1}{(2\pi )^{(n+s)/2} |K_p |^{1/2}} \\&\qquad \times \text {exp}\bigg (-\frac{1}{2} \left( {\begin{array}{c}\varvec{\zeta }- M_{y}\\ \varvec{z}- M_{z}\end{array}}\right) ^T K_p^{-1}\left( {\begin{array}{c}\varvec{\zeta }- M_{y}\\ \varvec{z}- M_{z}\end{array}}\right) \bigg ) \text {d}\varvec{\zeta }\\&\quad = \frac{1}{(2\pi )^{(n+s)/2} |K |^{1/2}} \text {exp}\bigg (-\frac{1}{2} \left( {\begin{array}{c}\varvec{y}- M_{y}\\ \varvec{z}- M_{z}\end{array}}\right) ^T K^{-1}\left( {\begin{array}{c}\varvec{y}- M_{y}\\ \varvec{z}- M_{z}\end{array}}\right) \bigg ) \\&\qquad \times \int _{\varvec{\zeta }} \frac{|K |^{1/2}}{(2\pi )^{n/2} |\sigma ^2 I_n|^{1/2} |K_p |^{1/2}} \\&\qquad \times \text {exp}\bigg (-\frac{1}{2}(\varvec{y}- \varvec{\zeta })^T(\sigma ^2I_n)^{-1}(\varvec{y}- \varvec{\zeta })\bigg ) \\&\quad \times \text {exp}\bigg (-\frac{1}{2} \left( {\begin{array}{c}\varvec{\zeta }- M_{y}\\ \varvec{z}- M_{z}\end{array}}\right) ^T K_p^{-1}\left( {\begin{array}{c}\varvec{\zeta }- M_{y}\\ \varvec{z}- M_{z}\end{array}}\right) \bigg ) \\&\qquad \times \text {exp}\bigg ( \frac{1}{2} \left( {\begin{array}{c}\varvec{y}- M_{y}\\ \varvec{z}- M_{z}\end{array}}\right) ^T K^{-1}\left( {\begin{array}{c}\varvec{y}- M_{y}\\ \varvec{z}- M_{z}\end{array}}\right) \bigg ) \text {d}\varvec{\zeta }\\&\quad = \frac{1}{(2\pi )^{(n+s)/2} |K |^{1/2}} \text {exp}\bigg (-\frac{1}{2} \left( {\begin{array}{c}\varvec{y}- M_{y}\\ \varvec{z}- M_{z}\end{array}}\right) ^T K^{-1}\left( {\begin{array}{c}\varvec{y}- M_{y}\\ \varvec{z}- M_{z}\end{array}}\right) \bigg ). \end{aligned}$$

The integral is equal to 1 since it is the integral of a multivariate normal probability density function over \(\varvec{\zeta }\), with covariance \(( (\sigma ^2 I_n)^{-1} + (C_{11} - C_{12}C^{-1}_{22} C_{21})^{-1})^{-1}\). Namely

$$\begin{aligned}&\frac{|K |^{1/2}}{|\sigma ^2 I_n|^{1/2} |K_p |^{1/2}} \\&\quad = \frac{|C_{22}|^{1/2} |C_{11} + \sigma ^2 I_n - C_{12}C^{-1}_{22} C_{21} |^{1/2}}{|\sigma ^2 I_n|^{1/2} |C_{22}|^{1/2} |C_{11} - C_{12}C^{-1}_{22} C_{21} |^{1/2}} \\&\quad = \frac{|C_{11} + \sigma ^2 I_n - C_{12}C^{-1}_{22} C_{21} |^{1/2}}{|\sigma ^2 I_n|^{1/2} |C_{11} - C_{12}C^{-1}_{22} C_{21} |^{1/2}} \\&\quad = \frac{|A + B|^{1/2}}{|A|^{1/2}|B|^{1/2}} = \frac{1}{|A|^{1/2}|B|^{1/2} |A + B|^{ -1/2}} \\&\quad = \frac{1}{(|A^{-1}||B^{-1}| |A + B|)^{ -1/2}} \\&\quad = \frac{1}{|A^{-1}B^{-1}A + A^{-1}B^{-1}B|^{ -1/2}} \\&\quad = \frac{1}{|A^{-1}B^{-1}A + A^{-1}|^{ -1/2}} \\&\quad = \frac{1}{|A^{-1}(B^{-1} + A^{-1})A|^{ -1/2}} \\&\quad = \frac{1}{(|A^{-1}| |(B^{-1} + A^{-1})| |A|)^{ -1/2}} \\&\quad = \frac{1}{|(B^{-1} + A^{-1})^{-1}|^{1/2}} \end{aligned}$$

where we used the Schur complement identity for determinants in the first equality and

$$\begin{aligned} A&= C_{11} - C_{12}C^{-1}_{22} C_{21}, \\ B&= \sigma ^2 I_n. \end{aligned}$$

Lastly, considering the notation

$$\begin{aligned} K_p^{-1}= \begin{pmatrix} C^-_{11} &{} C^-_{12}\\ C^-_{21} &{} C^-_{22} \end{pmatrix} \end{aligned}$$

we have

$$\begin{aligned}&\exp {\bigg (-\frac{1}{2}(\varvec{y}- \varvec{\zeta })^T(\sigma ^2I_n)^{-1}(\varvec{y}- \varvec{\zeta })\bigg )}\\&\quad \times \exp { \bigg (-\frac{1}{2} \left( {\begin{array}{c}\varvec{\zeta }- M_{y}\\ \varvec{z}- M_{z}\end{array}}\right) ^T K_p^{-1}\left( {\begin{array}{c}\varvec{\zeta }- M_{y}\\ \varvec{z}- M_{z}\end{array}}\right) \bigg )} \\ {}&\quad \times \exp {\bigg (\frac{1}{2} \left( {\begin{array}{c}\varvec{y}- M_{y}\\ \varvec{z}- M_{z}\end{array}}\right) ^T K^{-1}\left( {\begin{array}{c}\varvec{y}- M_{y}\\ \varvec{z}- M_{z}\end{array}}\right) \bigg )} \\&\quad \propto \exp {\bigg (-\frac{1}{2}\varvec{\zeta }^T ((\sigma ^2 I_n)^{-1}+ C^-_{11})\varvec{\zeta }+\varvec{\zeta }^T \varvec{b}\bigg )}, \end{aligned}$$

where \(C^-_{11} = (C_{11} - C_{12}C^{-1}_{22} C_{21})^{-1}\) by the block matrix inversion formula, and \(\varvec{b}\) is a constant column vector. This shows that the integral is indeed equal to 1 as stated, and the equality (30) holds.
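The marginal covariance identity underlying this argument is straightforward to check numerically. Below is a minimal Monte Carlo sketch (a small synthetic setup with random \(M\) and \(K_p\); all sizes and values are illustrative, not taken from the paper): drawing \((\varvec{\zeta }, \varvec{z})\) from \(N(M, K_p)\) and setting \(y_i = \zeta _i + \sigma \epsilon _i\) should produce \((\varvec{y}, \varvec{z})\) with covariance \(K\), which equals \(K_p\) with \(\sigma ^2 I_n\) added to the \((1,1)\) block.

```python
# Minimal Monte Carlo check of the marginalization behind (30); the sizes,
# seed, and random M, K_p are illustrative assumptions, not the paper's setup.
import numpy as np

rng = np.random.default_rng(0)
n, s, sigma = 3, 2, 0.5

M = rng.normal(size=n + s)                      # mean of (zeta, z)
A = rng.normal(size=(n + s, n + s))
K_p = A @ A.T + 1e-6 * np.eye(n + s)            # random SPD covariance

K = K_p.copy()
K[:n, :n] += sigma**2 * np.eye(n)               # noise added to the (1,1) block

# Draw (zeta, z) ~ N(M, K_p), then y = zeta + sigma * eps.
L = np.linalg.cholesky(K_p)
draws = M + (L @ rng.normal(size=(n + s, 200_000))).T
draws[:, :n] += sigma * rng.normal(size=(200_000, n))

# Empirical covariance of (y, z) should match K up to Monte Carlo error.
print(np.max(np.abs(np.cov(draws.T) - K)))      # small relative to entries of K
```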

Proof of Theorem 1

Note that for any \(\epsilon > 0\), the posterior probability of interest \(p(\zeta \in U_n^C|y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n)\) can be bounded from above as

$$\begin{aligned}&p(\zeta \in U_n^C|y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n)\\&\quad \le p(\zeta \in U_n^C|y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n)1_{ \{\left| \frac{{\hat{\sigma }}_{n}}{\sigma _0}-1\right| \le \epsilon \}} \\&\qquad + 1_{ \{\left| \frac{{\hat{\sigma }}_{n}}{\sigma _0}-1\right| > \epsilon \}}, \end{aligned}$$

where

$$\begin{aligned}&p(\zeta \in U_n^C|y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n)1_{ \{\left| \frac{{\hat{\sigma }}_{n}}{\sigma _0}-1\right| \le \epsilon \}} \\&\quad \le \varPhi _{n} \\&\qquad + \frac{(1 -\varPhi _{n})\int _{U_n^c \cap \mathcal {F}_n} \prod _{i = 1}^n\frac{p(y_i|\zeta _i,{\hat{\sigma }}_{n})}{p(y_i|\zeta _{0,i},\sigma _0)} 1_{ \{\left| \frac{{\hat{\sigma }}_{n}}{\sigma _0}-1\right| \le \epsilon \}} \text {d}\varPi (\varvec{\zeta }|\varvec{\theta }, \varvec{\phi })}{\int _{\mathcal {F}} \prod _{i = 1}^n\frac{p(y_i|\zeta _i,{\hat{\sigma }}_{n})}{p(y_i|\zeta _{0,i},\sigma _0)} \text {d}\varPi (\varvec{\zeta }|\varvec{\theta }, \varvec{\phi })}\\&\qquad + \frac{\int _{U_n^c \cap \mathcal {F}^C_n} \prod _{i = 1}^n\frac{p(y_i|\zeta _i,{\hat{\sigma }}_{n})}{p(y_i|\zeta _{0,i},\sigma _0)} 1_{ \{\left| \frac{{\hat{\sigma }}_{n}}{\sigma _0}-1\right| \le \epsilon \}} \text {d}\varPi (\varvec{\zeta }|\varvec{\theta }, \varvec{\phi })}{\int _{\mathcal {F}} \prod _{i = 1}^n\frac{p(y_i|\zeta _i,{\hat{\sigma }}_{n})}{p(y_i|\zeta _{0,i},\sigma _0)} \text {d}\varPi (\varvec{\zeta }|\varvec{\theta }, \varvec{\phi })} \\&\quad = \varPhi _{n} \\&\qquad + \frac{\mathbf {I}_{1n}(y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n, \epsilon )}{\mathbf {I}_{3n}(y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n)} + \frac{\mathbf {I}_{2n}(y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n, \epsilon )}{\mathbf {I}_{3n}(y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n)}. \end{aligned}$$

Since the assumption (A3) implies that \(1_{ \{\left| \frac{{\hat{\sigma }}_{n}}{\sigma _0}-1\right| > \epsilon \}}\xrightarrow [\text {n}]{\quad } 0 \text { a.s. } P_0\), it is enough to show that there exists \(\epsilon > 0\) so that

$$\begin{aligned}&\sup _{(\varvec{\theta }, \varvec{\phi }) \in \varUpsilon } \varPhi _{n} \xrightarrow [\text {n}]{\quad } 0 \text { a.s. } P_0, \end{aligned}$$
(31)
$$\begin{aligned}&\sup _{(\varvec{\theta }, \varvec{\phi }) \in \varUpsilon } e^{\beta _{1}n}\mathbf {I}_{1n}(y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n, \epsilon ) \xrightarrow [\text {n}]{\quad } 0 \text { a.s. } P_0, \end{aligned}$$
(32)
$$\begin{aligned}&\sup _{(\varvec{\theta }, \varvec{\phi }) \in \varUpsilon } e^{\beta _{2}n}\mathbf {I}_{2n}(y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n, \epsilon ) \xrightarrow [\text {n}]{\quad } 0 \text { a.s. } P_0, \end{aligned}$$
(33)
$$\begin{aligned}&{\inf _{(\varvec{\theta }, \varvec{\phi }) \in \varUpsilon }} e^{\beta _{3}n}\mathbf {I}_{3n}(y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n) \xrightarrow [\text {n}]{\quad } \infty \text { a.s. } P_0, \end{aligned}$$
(34)

for some \(\beta _1, \beta _2, \beta _3 > 0\) where \(\beta _3 \le \min \{\beta _1, \beta _2\}\).

The rest of the proof follows the general steps of the proofs of Theorem 1 in Choi and Schervish (2007) and Theorem 9 in Choi (2007), with some non-trivial treatment of the constant \(\epsilon \). Similarly to Choi (2007), we assume that the covariance function is smooth enough that the supremum of the variance of the Gaussian process is continuous with respect to \((\varvec{\theta }, \varvec{\phi })\) on the compact set \(\varUpsilon \). We provide step-by-step details below.

Step 1) By Markov's inequality, for any \(\rho > 0\)

$$\begin{aligned} \sum _{n=1}^{\infty }P_0(\varPhi _{n} > \rho ) \le \frac{1}{\rho } \sum _{n=1}^{\infty }\mathbb {E}_{\zeta _{0},\sigma _0}\varPhi _{n}, \end{aligned}$$

which, due to condition (i) of (A2) and the first Borel-Cantelli lemma, yields

$$\begin{aligned} \varPhi _{n} \xrightarrow [\text {n}]{\quad } 0 \text { a.s. } P_0. \end{aligned}$$

Since this does not depend on \((\varvec{\theta }, \varvec{\phi })\), it implies (31).

Step 2) By Fubini's theorem, for any \(0<\epsilon < {\tilde{\epsilon }}_2\)

$$\begin{aligned}&\mathbb {E}_{\zeta _{0},\sigma _0}(\mathbf {I}_{1n}(y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n, \epsilon ))\\&= \mathbb {E}_{\zeta _{0},\sigma _0} \bigg [(1 -\varPhi _{n})\int _{U_n^c \cap \mathcal {F}_n} \prod _{i = 1}^n\frac{p(y_i|\zeta _i,{\hat{\sigma }}_{n})}{p(y_i|\zeta _{0,i},\sigma _0)} 1_{ \{\left| \frac{{\hat{\sigma }}_{n}}{\sigma _0}-1\right| \le \epsilon \}} \text {d}\varPi (\varvec{\zeta }|\varvec{\theta }, \varvec{\phi })\bigg ] \\&=\int _{U_n^c \cap \mathcal {F}_n} \int (1 -\varPhi _{n}) \prod _{i = 1}^n\frac{p(y_i|\zeta _i,{\hat{\sigma }}_{n})}{p(y_i|\zeta _{0,i},\sigma _0)} 1_{ \{\left| \frac{{\hat{\sigma }}_{n}}{\sigma _0}-1\right| \le \epsilon \}}\text {d}P_0 \text {d}\varPi (\varvec{\zeta }|\varvec{\theta }, \varvec{\phi }) \\&\le {\left( \frac{\sigma _{0}(1-\epsilon )}{\sigma _{0}(1+\epsilon )}\right) }^{-n} \int _{{U_{n}}^{C}\cap \mathcal {F}_{n}} \mathbb {E}_{\varvec{\zeta },\sigma _{0}(1+\epsilon )} [(1-\varPhi _{n})] \text {d}\varPi (\varvec{\zeta }|\varvec{\theta }, \varvec{\phi }) \\&\le {\left( \frac{1-\epsilon }{1+\epsilon }\right) }^{-n} \sup _{\zeta \in U_n^C \cap \mathcal {F}_n} \mathbb {E}_{\zeta ,\sigma _{0}(1+\epsilon )}[( 1-\varPhi _{n})] \\&\le {\left( \frac{1-\epsilon }{1+\epsilon }\right) }^{-n} C_2 e^{-c_2 n} = C_2 e^{-\tilde{c}_\epsilon n}, \end{aligned}$$

where \(\tilde{c}_\epsilon = c_2 + \log (1-\epsilon ) - \log (1+\epsilon )\) together with condition (iii) of (A2) implies \(\tilde{c}_\epsilon > 0\). Thus

$$\begin{aligned} P_0 \bigg \{\mathbf {I}_{1n}(y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n, \epsilon ) \ge e^{-\tilde{c}_\epsilon \frac{n}{2}} \bigg \}&\le C_2 e^{\tilde{c}_\epsilon \frac{n}{2}}e^{-\tilde{c}_\epsilon n} \\&= C_2e^{-\tilde{c}_\epsilon \frac{n}{2}}. \end{aligned}$$

Therefore, for any \(\epsilon \) with \(0<\epsilon < {\tilde{\epsilon }}_2\), there exists a constant \(\tilde{c}_\epsilon > 0\) for which the first Borel-Cantelli lemma implies

$$\begin{aligned} e^{\tilde{c}_\epsilon \frac{n}{4}}\mathbf {I}_{1n}(y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n, \epsilon ) \xrightarrow [\text {n}]{\quad } 0 \text { a.s. } P_0. \end{aligned}$$

Since this does not depend on \((\varvec{\theta }, \varvec{\phi })\), it implies (32).

Step 3) Proceeding as in Step 2, Fubini's theorem implies

$$\begin{aligned}&\mathbb {E}_{\zeta _{0},\sigma _0}(\mathbf {I}_{2n}(y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n, \epsilon ))\\&= \mathbb {E}_{\zeta _{0},\sigma _0} \bigg [\int _{U_n^c \cap \mathcal {F}^C_n} \prod _{i = 1}^n\frac{p(y_i|\zeta _i,{\hat{\sigma }}_{n})}{p(y_i|\zeta _{0,i},\sigma _0)} 1_{ \{\left| \frac{{\hat{\sigma }}_{n}}{\sigma _0}-1\right| \le \epsilon \}} \text {d}\varPi (\varvec{\zeta }|\varvec{\theta }, \varvec{\phi })\bigg ] \\&\le {\left( \frac{\sigma _{0}(1-\epsilon )}{\sigma _{0}(1+\epsilon )}\right) }^{-n} \int _{{U_{n}}^{C}\cap \mathcal {F}_{n}^C} \mathbb {E}_{\zeta ,\sigma _{0}(1+\epsilon )} [1] \text {d}\varPi (\varvec{\zeta }|\varvec{\theta }, \varvec{\phi })\\&\le {\left( \frac{1-\epsilon }{1+\epsilon }\right) }^{-n} \varPi (\mathcal {F}^C_n|\varvec{\theta }, \varvec{\phi }). \end{aligned}$$

Condition (ii) of (A2) and the first Borel-Cantelli lemma imply that for any \(\epsilon < \frac{1 - e^{-c_1}}{1 + e^{-c_1}}\):

$$\begin{aligned} \sup _{(\varvec{\theta }, \varvec{\phi }) \in \varUpsilon } e^{\tilde{k}_\epsilon \frac{n}{4}}\mathbf {I}_{2n}(y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n, \epsilon ) \xrightarrow [\text {n}]{\quad } 0 \text { a.s. } P_0, \end{aligned}$$

where \(\tilde{k}_\epsilon = c_1 + \log (1 -\epsilon ) - \log (1+\epsilon )\).

Step 4) To prove (34), given any \(0<\rho < 1\), we first observe the following:

$$\begin{aligned}&\mathbf {I}_{3n}(y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n)\\&\ge \mathbf {I}_{3n}(y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n)1_{ \{\left| \frac{{\hat{\sigma }}_{n}}{\sigma _0}-1\right| \le \rho \}} \\&\ge {\left( \frac{1-\rho }{1+\rho }\right) }^{n} \int _{\mathcal {F}} \prod _{i = 1}^n\frac{p(y_i|\zeta _i,\sigma _0(1-\rho ))}{p(y_i|\zeta _{0,i},\sigma _0)} \text {d}\varPi (\varvec{\zeta }|\varvec{\theta }, \varvec{\phi }). \end{aligned}$$

Let us now define \(\log _+(x)= \max \{0, \log (x)\}\) and \(\log _-(x)= -\min \{0, \log (x)\}\) as well as

$$\begin{aligned} W_{i}&=\log _{+} \frac{p(y_{i}|{\zeta _{0,i}},\sigma _{0})}{p(y_{i}|{\zeta }_{i},\sigma _{0}(1-\rho ))}, \\ K_{i}^{+}(\zeta _0,\zeta )&= \int p(y_{i}|{\zeta _{0,i}},\sigma _{0}) \log _{+} \frac{p(y_{i}|{\zeta _{0,i}},\sigma _{0})}{p(y_{i}|\zeta _{i},\sigma _{0}(1-\rho ))} \text {d}y_{i},\\ K_{i}^{-}(\zeta _0,\zeta )&= \int p(y_{i}|{\zeta _{0,i}},\sigma _{0}) \log _{-} \frac{p(y_{i}|{\zeta _{0,i}},\sigma _{0})}{p(y_{i}|\zeta _{i},\sigma _{0}(1-\rho ))} \text {d}y_{i}. \end{aligned}$$

Then we get

$$\begin{aligned}&\mathbb {V}ar_{\zeta _0, \sigma _0}(W_i) = \mathbb {E}_{\zeta _0, \sigma _0}(W_i^2) - \{ K_{i}^{+}(\zeta _0,\zeta ) \}^2\\&\quad \le \mathbb {E}_{\zeta _0, \sigma _0}(W_i^2) - \{ K_{i}(\zeta _0,\zeta ) \}^2\\&\le \mathbb {E}_{\zeta _0, \sigma _0}(W_i^2) \\&\qquad + \int p(y_{i}|{\zeta _{0,i}},\sigma _{0}) \left( \log _{-} \frac{p(y_{i}|{\zeta _{0,i}},\sigma _{0})}{p(y_{i}|\zeta _{i},\sigma _{0}(1-\rho ))} \right) ^{2} \text {d}y_{i} \\&\qquad - \{ K_{i}(\zeta _0,\zeta ) \}^2\\&\quad = \int p(y_{i}|{\zeta _{0,i}},\sigma _{0}) \left( \log \frac{p(y_{i}|{\zeta _{0,i}},\sigma _{0})}{p(y_{i}|\zeta _{i},\sigma _{0}(1-\rho ))} \right) ^{2} \text {d}y_{i} \\&\qquad - \{ K_{i}(\zeta _0,\zeta ) \}^2\\&\quad = V_i(\zeta _0,\zeta ). \end{aligned}$$

Hence, by condition (i) of (A1) for any \(\rho < {\tilde{\epsilon }}_1\) and \(\zeta \in B\)

$$\begin{aligned} {\sum _{i=1}^{\infty }\frac{\mathbb {V}ar_{\zeta _0, \sigma _0}(W_i)}{i^2} \le \sum _{i=1}^{\infty }\frac{V_i(\zeta _0,\zeta )}{i^2} < \infty ,} \end{aligned}$$

and by Kolmogorov's strong law of large numbers for independent, non-identically distributed random variables (e.g., Shiryaev (1996), Chapter 3),

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^{n}(W_{i}-K_{i}^{+}(\zeta _0,\zeta )) \xrightarrow [\text {n}]{\quad } 0 \text { a.s. } P_0. \end{aligned}$$

As a result, for every \(\zeta \in B\), with \(P_0\) probability 1

$$\begin{aligned}&\liminf _{n \rightarrow \infty } \bigg (\frac{1}{n}\sum _{i=1}^{n} \log \frac{p(y_{i}|\zeta _{i},\sigma _{0}(1-\rho ))}{p(y_{i}|{\zeta _{0,i}},\sigma _{0})} \bigg ) \\&\quad =-\limsup _{n \rightarrow \infty } \left( \frac{1}{n}\sum _{i=1}^{n} -\log \frac{p(y_{i}|\zeta _{i},\sigma _{0}(1-\rho ))}{p(y_{i}|{\zeta _{0,i}},\sigma _{0})} \right) \\&\quad =-\limsup _{n \rightarrow \infty } \left( \frac{1}{n}\sum _{i=1}^{n} \log \frac{p(y_{i}|{\zeta _{0,i}},\sigma _{0})}{p(y_{i}|\zeta _{i},\sigma _{0}(1-\rho ))} \right) \\&\quad \ge -\limsup _{n \rightarrow \infty } \left( \frac{1}{n}\sum _{i=1}^{n} \log _{+} \frac{p(y_{i}|{\zeta _{0,i}},\sigma _{0})}{p(y_{i}|\zeta _{i},\sigma _{0}(1-\rho ))}\right) \\&\quad =-\limsup _{n \rightarrow \infty }\left( \frac{1}{n}\sum _{i=1}^{n} K_{i}^{+}(\zeta _0,\zeta )\right) \\&\quad \ge -\limsup _{n \rightarrow \infty }\left( \frac{1}{n}\sum _{i=1}^{n} K_{i}(\zeta _0,\zeta )+\frac{1}{n}\sum _{i=1}^{n} \sqrt{\frac{K_{i}(\zeta _0,\zeta )}{2}} \; \right) \\&\quad \ge -\limsup _{n \rightarrow \infty }\left( \frac{1}{n}\sum _{i=1}^{n} K_{i}(\zeta _0,\zeta )+\sqrt{\frac{1}{n}\sum _{i=1}^{n} \frac{K_{i}(\zeta _0,\zeta )}{2}} \; \right) . \end{aligned}$$

The equality involving \(K_{i}^{+}(\zeta _0,\zeta )\) follows from the almost sure convergence proved in the previous paragraph, the second-to-last inequality follows from Amewou-Atisso et al. (2003), and the last inequality follows from the Cauchy-Schwarz inequality. We now make use of condition (ii) of (A1). Let us consider \(\beta > 0\), select \(\varDelta \) so that \(\varDelta + \sqrt{\frac{\varDelta }{2}} \le \frac{\beta }{8}\), and set \(C = B \cap \{\zeta : K_i(\zeta _0,\zeta ) < \varDelta \text { for all }i\}\). By (A1) there exists \({\tilde{\epsilon }}_1\) so that \(\varPi (C|\varvec{\theta }, \varvec{\phi }) > 0\) for all \(0<\rho <{\tilde{\epsilon }}_1\). Therefore, for each \(\zeta \in C\)

$$\begin{aligned}&\liminf _{n \rightarrow \infty } \left( \frac{1}{n}\sum _{i=1}^{n} \log \frac{p(y_{i}|\zeta _{i},\sigma _{0}(1-\rho ))}{p(y_{i}|{\zeta _{0,i}},\sigma _{0})} \right) \\&\quad \ge -\limsup _{n \rightarrow \infty }\left( \frac{1}{n}\sum _{i=1}^{n} K_{i}(\zeta _0,\zeta )+\sqrt{\frac{1}{n}\sum _{i=1}^{n} \frac{K_{i}(\zeta _0,\zeta )}{2}} \;\right) \\&\quad \ge -\left( \varDelta +\sqrt{\frac{\varDelta }{2}}\right) , \end{aligned}$$

since \(\frac{1}{n}\sum _{i=1}^{n} K_{i}(\zeta _0,\zeta ) <\varDelta \) for all \(\zeta \in C\). Finally, for any \(\rho < \min \{{\tilde{\epsilon }}_1, \frac{1-e^{\frac{-\beta }{8}}}{1+e^{\frac{-\beta }{8}}}\}\)

$$\begin{aligned}&\liminf _{n \rightarrow \infty }e^{\frac{2n\beta }{8}} \mathbf {I}_{3n}(y_1, \dots , y_n, \varvec{\theta }, \varvec{\phi }, {\hat{\sigma }}_n) \\&\quad \ge \liminf _{n \rightarrow \infty }e^{\frac{2n\beta }{8}} {\left( \frac{1-\rho }{1+\rho }\right) }^{n} \int _{\mathcal {F}} \prod _{i = 1}^n\frac{p(y_i|\zeta _i,\sigma _0(1-\rho ))}{p(y_i|\zeta _{0,i},\sigma _0)} \text {d}\varPi (\varvec{\zeta }|\varvec{\theta }, \varvec{\phi })\\&\quad \ge \liminf _{n \rightarrow \infty }e^{\frac{2n\beta }{8}} {\left( \frac{1-\rho }{1+\rho }\right) }^{n} \int _{C} \prod _{i = 1}^n\frac{p(y_i|\zeta _i,\sigma _0(1-\rho ))}{p(y_i|\zeta _{0,i},\sigma _0)} \text {d}\varPi (\varvec{\zeta }|\varvec{\theta }, \varvec{\phi })\\&\quad \ge \int _{C} \liminf _{n \rightarrow \infty }e^{\frac{2n\beta }{8}} {\left( \frac{1-\rho }{1+\rho }\right) }^{n} \prod _{i = 1}^n\frac{p(y_i|\zeta _i,\sigma _0(1-\rho ))}{p(y_i|\zeta _{0,i},\sigma _0)} \text {d}\varPi (\varvec{\zeta }|\varvec{\theta }, \varvec{\phi })\\&\quad = \infty . \end{aligned}$$

The last inequality follows from Fatou's lemma, and the limit is infinite because the exponential growth rate of the integrand is strictly positive on \(C\) and \(\varPi (C|\varvec{\theta }, \varvec{\phi }) > 0\). Note that the actual bound on \(\mathbf {I}_{3n}\) does not depend on \((\varvec{\theta }, \varvec{\phi })\). Taking \(\epsilon < \min \{{\tilde{\epsilon }}_2, \frac{1 - e^{-c_1}}{1 + e^{-c_1}}\}\) concludes the proof.

Proof of Lemma 1

Theorem 5 of Ghosal and Roy (2006) implies that there exist positive constants \(C, d_0, d_1, \dots , d_p\) so that for \(i = 1, \dots , p\)

$$\begin{aligned} P\left( \sup _{ \varvec{t}\in [0,1]^p}|\zeta (\varvec{t})|>M_{n}\,\bigg |\, \varvec{z}, \varvec{\theta }, \varvec{\phi }\right)&\le Ce^{-d_0 \frac{M_{n}^2}{\rho _{0}^{2}(\varvec{\theta }, \varvec{\phi })}}, \\ P\left( \sup _{ \varvec{t}\in [0,1]^p}\Big |\frac{\partial }{\partial t_{i}}\zeta (\varvec{t})\Big |>M_{n}\,\bigg |\, \varvec{z}, \varvec{\theta }, \varvec{\phi }\right)&\le Ce^{-d_i \frac{M_{n}^2}{\rho _{i}^{2}(\varvec{\theta }, \varvec{\phi })}}. \end{aligned}$$

The continuity of \(\rho ^2_i(\varvec{\theta }, \varvec{\phi })\), \(i = 0,\dots , p\), on the compact set \(\varUpsilon \) implies that they are uniformly bounded. Therefore, there exist universal constants \((c_{0,1},c_{0,2}), \dots , (c_{p,1}, c_{p,2})\) such that for \(i = 0, \dots , p\),

$$\begin{aligned} 0 < {c_{i,1}} \le \sup _{(\varvec{\theta }, \varvec{\phi }) \in \varUpsilon } |\rho _i^2(\varvec{\theta }, \varvec{\phi })| \le {c_{i,2}}. \end{aligned}$$

Hence, for \(i = 0, \cdots , p\),

$$\begin{aligned} \sup _{(\varvec{\theta }, \varvec{\phi }) \in \varUpsilon } P\left( \sup _{ \varvec{t}\in [0,1]^p}|\zeta (\varvec{t})|>M_{n}\,\bigg |\, \varvec{z}, \varvec{\theta }, \varvec{\phi }\right)&\le Ce^{-d_0 \frac{M_{n}^2}{{c_{0,2}}}}, \\ \sup _{(\varvec{\theta }, \varvec{\phi }) \in \varUpsilon } P\left( \sup _{ \varvec{t}\in [0,1]^p}\Big |\frac{\partial }{\partial t_{i}}\zeta (\varvec{t})\Big |>M_{n}\,\bigg |\, \varvec{z}, \varvec{\theta }, \varvec{\phi }\right)&\le Ce^{-d_i \frac{M_{n}^2}{{c_{i,2}}}}. \end{aligned}$$

Proof of Lemma 2

We shall first define some notation. Let \(0<r <\frac{\nu }{2}\) and \(t = \frac{r}{4}\). Let \(N_t = N(t, \mathcal {F}_n, \parallel \cdot \parallel _{\infty })\) be the covering number of \(\mathcal {F}_n\). By Theorem 2.7.1 of van der Vaart and Wellner (1996), there exists a constant K so that \(\log N_t \le \frac{KM_n}{t^p}\) and therefore \(\log N_t = \mathcal {O}(M_n)\), where \(M_n = \mathcal {O}(n^\alpha )\) for \(\alpha \in (\frac{1}{2},1)\) according to the definition of the sieves. Let us consider \(\tau \in (\frac{\alpha }{2}, \frac{1}{2})\) and define \(c_n = n^{\tau }\) so that \(\log (N_t) = o(c_n^2)\). Moreover, let \(\zeta ^1, \dots , \zeta ^{N_t} \in \mathcal {F}_n\) be finitely many elements of the sieve so that for every \(\zeta \in \mathcal {F}_n\) there is \(i \in \{1, \dots , N_t\}\) satisfying \(\parallel \zeta - \zeta ^i \parallel _\infty < t\). This implies that if \(\zeta \in \mathcal {F}_n\) satisfies \(\int |\zeta (\varvec{t}) - \zeta _0(\varvec{t})| \text {d}Q_n(\varvec{t}) > \nu \), then \(\int |\zeta ^i(\varvec{t}) - \zeta _0(\varvec{t})| \text {d}Q_n(\varvec{t}) > \frac{\nu }{2}\).

The next step in the proof is to construct a test for each \(\zeta ^i\), with the resulting test functions \(\varPhi _n\) defined as a combination of the individual tests, and to show that the probabilities of type I and type II errors satisfy the properties of the lemma. Let us recall that \(\zeta _j = \zeta (\varvec{t}_j)\) and \(\zeta _{0,j} = \zeta _0 (\varvec{t}_j)\). For an arbitrary \(\zeta \in \mathcal {F}_n\) such that \(\parallel \zeta - \zeta ^i \parallel _\infty < t\), let us define \(\zeta _{1,j} = \zeta ^i(\varvec{t}_j)\) and \(b_j = 1\) if \(\zeta _{1,j} > \zeta _{0,j}\) and \(b_j = -1\) otherwise. For any \(\nu > 0\), let \(\varPsi _n[\zeta , \nu ]\) be the indicator of the set A defined as follows

$$\begin{aligned} A = \left\{ \sum _{j=1}^{n}b_{j}\left( \frac{y_{j}-\zeta _{0,j}}{\sigma _0}\right) > 2c_{n}\sqrt{n} \right\} . \end{aligned}$$

The test functions \(\varPhi _n\) are then

$$\begin{aligned} \varPhi _{n}= \max _{1\le j \le N_{t}}\varPsi _{n}[\zeta ^{j},\frac{\nu }{2}]. \end{aligned}$$

Type I error) The Mills ratio bound implies

$$\begin{aligned} \mathbb {E}_{\zeta _{0},\sigma _0}(\varPsi _{n})&=P_{0}\left[ \sum _{j=1}^{n}b_{j}\left( \frac{y_{j}-\zeta _{0,j}}{\sigma _0}\right) > 2c_{n}\sqrt{n}\right] \\&=1-\varPhi (2c_n)\\&\le \frac{1}{2c_n\sqrt{2\pi }} e^{-2c_n^2}\\&\le e^{-2c_n^2}. \end{aligned}$$

The function \(\varPhi (\cdot )\) is the CDF of the standard normal distribution. Consequently, we have

$$\begin{aligned} \mathbb {E}_{\zeta _{0},\sigma _0}(\varPhi _{n})&\le \sum _{j =1}^{N_t} \mathbb {E}_{\zeta _{0},\sigma _0}(\varPsi _{n}[\zeta ^{j},\frac{\nu }{2}])\\&\le N_t e^{-2c_n^2} = e^{\log (N_t)-2c_n^2}\\&\le e^{-c_n^2}, \end{aligned}$$

and

$$\begin{aligned} \sum _{n=1}^{\infty }\mathbb {E}_{\zeta _{0},\sigma _0}\varPhi _{n} < \infty . \end{aligned}$$
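The Mills ratio step above is easy to sanity-check numerically. A minimal sketch follows (the values of c are illustrative stand-ins for \(c_n = n^\tau \); the final bound requires \(2c\sqrt{2\pi } \ge 1\), which eventually holds since \(c_n \rightarrow \infty \)):

```python
# Numerical check of 1 - Phi(2c) <= exp(-2c^2)/(2c sqrt(2 pi)) <= exp(-2c^2);
# the c values are illustrative stand-ins for c_n.
import numpy as np
from scipy.stats import norm

for c in [0.5, 1.0, 2.0, 4.0]:
    tail = norm.sf(2 * c)                                  # 1 - Phi(2c)
    mills = np.exp(-2 * c**2) / (2 * c * np.sqrt(2 * np.pi))
    print(c, bool(tail <= mills <= np.exp(-2 * c**2)))     # True for all c shown
```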

Type II error) It is sufficient to find an i for which the probability of type II error of \(\varPsi _n[\zeta ^i, \frac{\nu }{2}]\), for an arbitrary \(\zeta \in W^C_{\nu , n} \cap \mathcal {F}_n\), is sufficiently small, because the probability of type II error of the composite test \(\varPhi _n\) is no larger than that of any individual test \(\varPsi _n[\zeta ^i, \frac{\nu }{2}]\). Note that here we assume \(\int |\zeta (\varvec{t}) - \zeta _0(\varvec{t})| \text {d}Q_n(\varvec{t}) > \nu \), and therefore \(\int |\zeta ^i(\varvec{t}) - \zeta _0(\varvec{t})| \text {d}Q_n(\varvec{t}) > \frac{\nu }{2}\). For every \(r < \frac{\nu }{2}\), Choi (2005) shows that

$$\begin{aligned} \sum _{j=1}^n |\zeta _{1,j} - \zeta _{0,j}| > rn. \end{aligned}$$

Let n be large enough that \(4\sigma _0 c_n < r\sqrt{n}\); then for any \(0<\epsilon <1\)

$$\begin{aligned}&\mathbb {E}_{\zeta , \sigma _0 (1 + \epsilon )}(1 - \varPsi _n[\zeta ^i, \frac{\nu }{2}]) \\&= P_{\zeta , \sigma _0 (1 + \epsilon )} \Bigg [\sum _{j=1}^{n}b_{j}\left( \frac{y_{j}-\zeta _{0,j}}{\sigma _0}\right) \le 2c_{n}\sqrt{n}\Bigg ] \\&=P_{\zeta , \sigma _0 (1 + \epsilon )} \Bigg [\frac{1}{\sqrt{n}}\sum _{j=1}^{n}b_{j}\left( \frac{y_{j}-\zeta _{j}}{\sigma _0}\right) + \frac{1}{\sqrt{n}}\sum _{j=1}^{n}b_{j}\left( \frac{\zeta _{j}-\zeta _{1,j}}{\sigma _0}\right) \\&\quad + \frac{1}{\sqrt{n}}\sum _{j=1}^{n}\left| \frac{\zeta _{1,j}-\zeta _{0,j}}{\sigma _0}\right| \le 2 c_n \Bigg ]\\&\le P_{\zeta , \sigma _0 (1 + \epsilon )} \Bigg [ \frac{1}{\sqrt{n}}\sum _{j=1}^{n}b_{j}\left( \frac{y_{j}-\zeta _{j}}{\sigma _0}\right) \le \frac{r \sqrt{n}}{4 \sigma _0} - \frac{r \sqrt{n}}{\sigma _0} + 2c_n\Bigg ]\\&\le P_{\zeta , \sigma _0 (1 + \epsilon )} \Bigg [ \frac{1}{\sqrt{n}}\sum _{j=1}^{n}b_{j}\left( \frac{y_{j}-\zeta _{j}}{\sigma _0 (1+\epsilon )}\right) \le - \frac{r \sqrt{n}}{4 \sigma _0 (1+\epsilon )} \Bigg ]\\&=\varPhi \left( - \frac{r \sqrt{n}}{4 \sigma _0 (1+\epsilon )} \right) \\&\le \frac{4\sigma _0(1+\epsilon )}{r \sqrt{2 \pi n}}e^{-\frac{nr^2}{32 \sigma _0^2 (1+\epsilon )^2}}. \end{aligned}$$

To establish part (ii) of the lemma, we need to show that there exists \(0<{\tilde{\epsilon }}<1\) so that for any \(\epsilon <{\tilde{\epsilon }}\)

$$\begin{aligned} \frac{r^2}{32 \sigma _0^2 (1+\epsilon )^2} + \log \left( \frac{1-\epsilon }{1+\epsilon }\right) > 0. \end{aligned}$$
(35)

Take \(\kappa = \frac{r^2}{32 \sigma _0^2}\) and define \(b(\epsilon )\) to be the left-hand side of (35),

$$\begin{aligned} b(\epsilon ) = \kappa \left( \frac{1}{(1+\epsilon )^2} + \frac{1}{\kappa } \log \left( \frac{1-\epsilon }{1+\epsilon }\right) \right) . \end{aligned}$$

The function \(b(\epsilon )\) is continuous at \(\epsilon = 0\) with \(b(0) = \kappa > 0\). Hence, for each \(\kappa > 0\), there exists \({\tilde{\epsilon }}\) such that \(b(\epsilon ) > 0\) for all \(0<\epsilon <{\tilde{\epsilon }}\).
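A quick numerical illustration of this positivity near zero (the value of \(\kappa \) is an illustrative placeholder):

```python
# b(eps) stays positive for eps small enough; kappa is an illustrative value.
import numpy as np

kappa = 0.05
def b(eps):
    return kappa * (1 / (1 + eps)**2 + np.log((1 - eps) / (1 + eps)) / kappa)

for eps in [1e-4, 1e-3, 1e-2]:
    print(eps, b(eps) > 0)   # True while eps is below roughly kappa/(2(kappa+1))
```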

Proof of Theorem 2

First, we show that \({\hat{\sigma }}_n^2\) is asymptotically unbiased. Note that

$$\begin{aligned} \mathbb {E}[(y_{i+1} - y_i)^2]&= [\zeta _0(\varvec{t}_{i + 1}) - \zeta _0(\varvec{t}_{i})]^2 + \sigma _0^2\mathbb {E}[(\epsilon _{i+1} - \epsilon _i)^2] \\&= [\zeta _0(\varvec{t}_{i + 1}) - \zeta _0(\varvec{t}_{i})]^2 + 2\sigma _0^2, \end{aligned}$$

because \(\epsilon _i {\mathop {\sim }\limits ^{i.i.d.}} \mathcal {N}(0,1)\). Consequently

$$\begin{aligned} \mathbb {E}({\hat{\sigma }}_n^2) =\frac{\sum _{i = 1}^{n-1}[\zeta _0(\varvec{t}_{i + 1}) - \zeta _0(\varvec{t}_{i})]^2}{2(n - 1)} + \sigma _0^2. \end{aligned}$$
(36)

Since \(\zeta _0\) is continuously differentiable on the compact and convex set \(\varvec{\varOmega }\), it is also (globally) Lipschitz on \(\varvec{\varOmega }\) (e.g. Schaeffer and Cain (2016), Corollary 3.2.4), and there exists a real constant K so that

$$\begin{aligned} |\zeta _0(\varvec{t}_{i + 1}) - \zeta _0(\varvec{t}_{i})| \le K \sum _{j = 1}^ p |t_{i+1,j} - t_{i,j}|. \end{aligned}$$

Therefore, due to the design assumption (AD)

$$\begin{aligned} \begin{aligned} 0&\le \frac{\sum _{i = 1}^{n-1}[\zeta _0(\varvec{t}_{i + 1}) - \zeta _0(\varvec{t}_{i})]^2}{2(n - 1)} \\&\le \frac{K^2p^2}{2} \bigg [\sup _{i \in \{1, \dots , n\}, j \in \{1, \dots , p\}} |t_{i+1, j} - t_{i,j}|\bigg ]^2 \xrightarrow [\text {n}]{\quad \quad } 0, \end{aligned} \end{aligned}$$
(37)

and the combination of (36) with (37) implies

$$\begin{aligned} \mathbb {E}({\hat{\sigma }}_n^2) \xrightarrow [\text {n}]{\quad \quad } \sigma _0^2. \end{aligned}$$
(38)

To show the almost sure convergence of \({\hat{\sigma }}_n^2\), let us now denote \(x_i = (y_{i + 1} - y_i)^2\) and rewrite the estimator \({\hat{\sigma }}_n^2\) as a sum of two estimators, each consisting of a sum of independent variables:

$$\begin{aligned} {\hat{\sigma }}_n^2 = \frac{\frac{1}{2} \sum _{i = 1}^{\frac{n -1 }{2}} x_{2i} }{2\big (\frac{n -1 }{2}\big )} + \frac{\frac{1}{2} \sum _{j = 1}^{\frac{n -1 }{2}} x_{2j-1} }{2\big (\frac{n -1 }{2}\big )} = {\hat{\sigma }}_{n,e}^2 + {\hat{\sigma }}_{n,o}^2. \end{aligned}$$

Without loss of generality, we assume that n is an odd integer. Lastly, note that \(\mathbb {V}ar(x_i) \le C < \infty \) uniformly in i: the differences \(\zeta _0(\varvec{t}_{i+1}) - \zeta _0(\varvec{t}_{i})\) are uniformly bounded on the compact set \(\varvec{\varOmega }\) due to the continuity of \(\zeta _0\), and the \(y_{i+1} - y_i\) are normal with bounded moments. We can now apply Kolmogorov's strong law of large numbers for independent, non-identically distributed random variables (e.g., Shiryaev (1996), Chapter 3):

$$\begin{aligned}&{\hat{\sigma }}_{n,e}^2 \xrightarrow [\text {n}]{\quad \quad } \frac{1}{2}\sigma _0^2 \quad \text {a.s. } P_0 \\&{\hat{\sigma }}_{n,o}^2 \xrightarrow [\text {n}]{\quad \quad } \frac{1}{2}\sigma _0^2 \quad \text {a.s. } P_0 \end{aligned}$$

and as a result

$$\begin{aligned} {\hat{\sigma }}_n^2 = {\hat{\sigma }}_{n,e}^2 + {\hat{\sigma }}_{n,o}^2 \xrightarrow [\text {n}]{\quad \quad } \sigma _0^2 \quad \text {a.s. } P_0. \end{aligned}$$
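The behavior established in Theorem 2 can be illustrated with a short simulation; the smooth \(\zeta _0\), the sorted uniform design, and all constants below are illustrative assumptions, not the paper's experiment.

```python
# First-difference variance estimator from Theorem 2 on a synthetic example.
import numpy as np

rng = np.random.default_rng(1)
sigma0 = 0.3
zeta0 = lambda t: np.sin(2 * np.pi * t) + t**2   # an assumed smooth true process

for n in [101, 1_001, 10_001]:                   # odd n, matching the proof
    t = np.sort(rng.uniform(size=n))             # design gaps shrink as n grows
    y = zeta0(t) + sigma0 * rng.normal(size=n)
    sigma2_hat = np.sum(np.diff(y) ** 2) / (2 * (n - 1))
    print(n, sigma2_hat)                         # approaches sigma0**2 = 0.09
```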

The LDM calibration

The analysis of the LDM follows our previous study in Kejzlar and Maiti (2020). Here we provide a concise discussion of the choices of prior distributions and of the GP specifications.

GP specifications. For the computer model \(E_{\text {B}}(Z,N)\), we consider a GP prior with mean zero and covariance function

$$\begin{aligned} \eta _E \cdot \exp \bigg (&-\frac{(Z - Z' )^2}{2\nu ^2_Z} -\frac{(N - N')^2}{2\nu ^2_N} -\frac{(\theta _{\text {vol}} - \theta _{\text {vol}}')^2}{2\nu ^2_1} \\&- \frac{(\theta _{\text {surf}} - \theta _{\text {surf}}' )^2}{2\nu ^2_2} -\frac{(\theta _{\text {sym}} - \theta _{\text {sym}}' )^2}{2\nu ^2_3} -\frac{(\theta _{\text {C}} - \theta _{\text {C}}' )^2}{2\nu ^2_4}\bigg ). \end{aligned}$$

We also assume a GP prior for the systematic discrepancy \(\delta (Z,N)\) with mean zero and covariance function

$$\begin{aligned} \eta _\delta \cdot \exp \bigg (-\frac{(Z - Z' )^2}{2l^2_Z} -\frac{(N - N' )^2}{2l^2_N}\bigg ). \end{aligned}$$
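For concreteness, a minimal sketch of these two covariance functions follows; the hyperparameter values in the example call are placeholders, not the calibrated ones.

```python
# Sketch of the emulator kernel k_f over (Z, N, theta) and the discrepancy
# kernel k_delta over (Z, N); all numerical values below are placeholders.
import numpy as np

def k_f(x, xp, eta_E, nu):
    x, xp, nu = map(np.asarray, (x, xp, nu))
    return eta_E * np.exp(-0.5 * np.sum(((x - xp) / nu) ** 2))

def k_delta(zn, znp, eta_delta, l):
    zn, znp, l = map(np.asarray, (zn, znp, l))
    return eta_delta * np.exp(-0.5 * np.sum(((zn - znp) / l) ** 2))

# Two (Z, N, theta_vol, theta_surf, theta_sym, theta_C) points.
x1 = [28, 34, 15.4, 16.9, 22.5, 0.69]
x2 = [30, 36, 15.5, 17.0, 22.4, 0.70]
print(k_f(x1, x2, eta_E=1.0, nu=[4, 4, 0.5, 0.8, 0.7, 0.02]))
print(k_delta(x1[:2], x2[:2], eta_delta=1.0, l=[4, 4]))
```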

Prior distributions. The prior distributions for the calibration parameters \(\varvec{\theta }\) are chosen to be wide enough to cover the space of all their reasonable values:

$$\begin{aligned} \theta _{\text {vol}}&\sim \mathcal {N}(15.42, 0.203), \\ \theta _{\text {surf}}&\sim \mathcal {N}(16.91, 0.645), \\ \theta _{\text {sym}}&\sim \mathcal {N}(22.47, 0.525), \\ \theta _{\text {C}}&\sim \mathcal {N}(0.69, 0.015). \end{aligned}$$

The prior distributions for the hyperparameters \(\varvec{\phi }\) were selected as \(Gamma(\alpha , \beta )\) with shape parameter \(\alpha \) and scale parameter \(\beta \). They are chosen to be weakly informative while corresponding to the scale of these parameters given by the literature on nuclear mass models (Weizsäcker 1935; Bethe and Bacher 1936; Myers and Swiatecki 1966; Fayans 1998; Kirson 2008; McDonnell et al. 2015; Kortelainen et al. 2010, 2012, 2014; Benzaid et al. 2020; Kejzlar et al. 2020). In particular,

$$\begin{aligned} \sigma&\sim Gamma(2,1),\\ \eta _\delta&\sim Gamma(10,1),\\ l_Z&\sim Gamma(10,1), \\ l_N&\sim Gamma(10,1), \\ \nu _Z&\sim Gamma(10,1), \\ \nu _N&\sim Gamma(10,1), \\ \nu _i&\sim Gamma(10,1), i = 1,2,3,4. \end{aligned}$$

Since the majority of the masses in the training dataset are larger than 1000 MeV, we consider the following prior for \(\eta _f\) to reflect this:

$$\begin{aligned} \eta _f \sim Gamma(110,10). \end{aligned}$$
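As a quick check that this prior sits on the right scale: under the shape-scale convention, \(Gamma(110,10)\) has mean \(110 \times 10 = 1100\) and standard deviation \(\sqrt{110} \times 10 \approx 105\), consistent with masses above 1000 MeV. For instance:

```python
# Mean and standard deviation of the Gamma(110, 10) prior (shape-scale).
from scipy.stats import gamma

prior = gamma(a=110, scale=10)
print(prior.mean(), prior.std())   # 1100.0, ~104.9
```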

Numerical study of the conditional covariance \(k_\zeta \)

Here we present a numerical investigation of our conjecture about the asymptotic behavior of the conditional covariance \(k_{\zeta }\). We show that, with an increasing number of model evaluations s (assuming a space-filling design), the covariance function \(k_{\delta }\) quickly dominates, which strongly suggests similar asymptotic behavior of \(k_{\zeta }\) and \(k_{\delta }\) with respect to s. Our rationale is that by informing the prior distribution for \(\zeta \) with more model evaluations, we effectively reduce the uncertainty about the computer model.

1.1 Study design

We consider a simple scenario with the joint space of model and calibration inputs over \([0,1]^2\). The input pairs \((\tilde{t}_j, {\tilde{\theta }}_j)\) were generated using a space-filling Latin hypercube design. The true value of the calibration parameter was varied over \(\theta \in \{0.3, 0.5, 0.8\}\). A sketch of this computation is given after this paragraph.
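The sketch below illustrates the computation behind Figs. 6, 7, 8 and 9 under the conjecture's decomposition of \(k_{\zeta }\) as \(k_{\delta }\) plus the emulator's conditional covariance given the s model runs, so that \(|k_{\zeta }(t_i,t_j) - k_{\delta }(t_i,t_j)|\) equals the emulator's posterior covariance at \(((t_i,\theta ),(t_j,\theta ))\). The kernel form, length scale, nugget, and grid of s values are illustrative assumptions.

```python
# |k_zeta - k_delta| at (t_i, t_j) as a function of the number of model runs s,
# using a squared exponential kernel and a Latin hypercube design over [0,1]^2.
# Hyperparameters and the jitter are illustrative choices.
import numpy as np
from scipy.stats import qmc

def sqexp(X, Y, eta=1.0, l=0.3):
    d2 = ((X[:, None, :] - Y[None, :, :]) / l) ** 2
    return eta * np.exp(-0.5 * d2.sum(axis=-1))

theta, ti, tj = 0.5, 0.2, 0.4
P = np.array([[ti, theta], [tj, theta]])              # prediction points
for s in [5, 20, 80, 320]:
    D = qmc.LatinHypercube(d=2, seed=0).random(n=s)   # (t~, theta~) design
    K_DD = sqexp(D, D) + 1e-6 * np.eye(s)             # nugget for stability
    K_PD = sqexp(P, D)
    post = sqexp(P, P) - K_PD @ np.linalg.solve(K_DD, K_PD.T)
    print(s, abs(post[0, 1]))                         # shrinks as s grows
```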

Fig. 6

The absolute difference between the conditional kernel \(k_{\zeta }\) and \(k_{\delta }\) for the model inputs \(t_i = 0.2\) and \(t_j = 0.4\) and true calibration parameter values \(\theta \in \{0.3, 0.5, 0.8\}\). This is the squared exponential kernel case

Fig. 7

The absolute difference between the conditional kernel \(k_{\zeta }\) and \(k_{\delta }\) for the model inputs \(t_i = 0.3\) and \(t_j = 0.7\) and true calibration parameter values \(\theta \in \{0.3, 0.5, 0.8\}\). This is the squared exponential kernel case

Fig. 8

The absolute difference between the conditional kernel \(k_{\zeta }\) and \(k_{\delta }\) for the model inputs \(t_i = 0.2\) and \(t_j = 0.4\) and true calibration parameter values \(\theta \in \{0.3, 0.5, 0.8\}\). This is the tensor-product Matérn case

Fig. 9

The absolute difference between the conditional kernel \(k_{\zeta }\) and \(k_{\delta }\) for the model inputs \(t_i = 0.3\) and \(t_j = 0.7\) and true calibration parameter values \(\theta \in \{0.3, 0.5, 0.8\}\). This is the tensor-product Matérn case

Fig. 10

Comparison of the convergence to the true physical process \(\zeta _0(t,x)\) under the empirical Bayes approach and the fully Bayesian implementation given by the Metropolis-Hastings algorithm. The dashed line represents the true process \(\zeta _0\), and the solid line corresponds to the mean of the posterior predictive distribution under the respective method. The curves with \(95 \%\) credible intervals (shaded area) are plotted at \(t=0.00\)

Fig. 11

Comparison of the convergence to the true physical process \(\zeta _0(t,x)\) under the empirical Bayes approach and the fully Bayesian implementation given by the Metropolis-Hastings algorithm. The dashed line represents the true process \(\zeta _0\), and the solid line corresponds to the mean of the posterior predictive distribution under the respective method. The curves with \(95 \%\) credible intervals (shaded area) are plotted at \(t=0.43\)

Fig. 12

Comparison of the convergence to the true physical process \(\zeta _0(t,x)\) under the empirical Bayes approach and the fully Bayesian implementation given by the Metropolis-Hastings algorithm. The dashed line represents the true process \(\zeta _0\), and the solid line corresponds to the mean of the posterior predictive distribution under the respective method. The curves with \(95 \%\) credible intervals (shaded area) are plotted at \(t=0.71\)

1.2 Results

Figures 6 and 7 show the values of \(| k_{\zeta }(t_i,t_j) - k_{\delta }(t_i,t_j)|\) as a function of the number of model runs in the case of squared exponential covariance kernels for both \(k_f\) and \(k_{\delta }\). The hyperparameter values were fixed to \(\eta _f = \eta _{\delta } = 1\), with varying values for the length scales such that \(l_f = l_{\delta }\).

Fig. 13

Comparison of the convergence to the true physical process \(\zeta _0(t,x)\) under the empirical Bayes approach and the fully Bayesian implementation given by the Metropolis-Hastings algorithm. The dashed line represents the true process \(\zeta _0\), and the solid line corresponds to the mean of the posterior predictive distribution under the respective method. The curves with \(95 \%\) credible intervals (shaded area) are plotted at \(t=1.00\)

Fig. 14

Details of \(95 \%\) credible bands of the posterior predictive distributions under the empirical Bayes approach and the fully Bayesian approach based on the Metropolis-Hastings algorithm. These were plotted at \(t=0.00\)

Fig. 15

Details of \(95 \%\) credible bands of the posterior predictive distributions under the empirical Bayes approach and the fully Bayesian approach based on the Metropolis-Hastings algorithm. These were plotted at \(t=0.43\)

Analogously, Figs. 8 and 9 correspond to the case of tensor-product Matérn kernels with the standard choice of smoothness parameter \(\lambda = 2.5\), which is sufficient according to the theory discussed in Sect. 4.1.1. The remaining hyperparameters are the same as in the squared exponential case.

Both choices of kernel functions for \(k_f\) and \(k_{\delta }\) exhibit the hypothesized dominance of \(k_{\delta }\) with an increasing number of model runs. This happens irrespective of the choice of kernel function, model inputs t, calibration parameter \(\theta \), and length scales. In the case of the squared exponential kernel, the absolute difference between \(k_{\zeta }\) and \(k_{\delta }\) decreases quickly and reaches the limits of numerical stability. The rate of convergence is considerably slower for the Matérn kernel, which is likely related to its limited smoothness.

Additional results for simulation study: Transverse harmonic wave

The following figures show additional results of the empirical Bayes fit for the transverse harmonic wave simulation study at the time locations \(t=0\), \(t=0.43\), \(t=0.71\), and \(t=1\) (Figs. 10, 11, 12, 13, 14, 15, 16, 17).

Fig. 16

Details of \(95 \%\) credible bands of the posterior predictive distributions under the empirical Bayes approach and the fully Bayesian approach based on the Metropolis-Hastings algorithm. These were plotted at \(t=0.71\)

Fig. 17

Details of \(95 \%\) credible bands of the posterior predictive distributions under the empirical Bayes approach and the fully Bayesian approach based on the Metropolis-Hastings algorithm. These were plotted at \(t=1.00\)


Cite this article

Kejzlar, V., Son, M., Bhattacharya, S. et al. A fast and calibrated computer model emulator: an empirical Bayes approach. Stat Comput 31, 49 (2021). https://doi.org/10.1007/s11222-021-10024-8
