1 Introduction

In this paper, our aim is to investigate the fast mean-reverting volatility asymptotics for an SPDE-based structural model for portfolio credit. SPDEs arising from large portfolio limits of collections of defaultable constant volatility models were initially studied in Bush et al. [5], and their regularity was further investigated in Ledger [21]. In Hambly and Kolliopoulos [15, 17], we extended this work to a two-dimensional stochastic volatility setting, and here we consider the question of effective one-dimensional constant volatility approximations which arise by considering fast mean-reversion in the volatilities. This approach is to some extent motivated by the ideas analysed in Fouque et al. [11, Chaps. 3–11], but instead of option prices, we look at the systemic risk of large credit portfolios in the fast mean-reverting volatility setting.

The literature on large portfolio limit models in credit can be divided into two approaches based on either structural or reduced form models for the individual assets. Our focus will be on the structural approach, where we assume that we are modelling the financial health of the firms directly and default occurs when these health processes hit a lower barrier.

The reduced form setting assumes that each firm defaults at the first jump time of a Poisson process, and we model the default intensities directly. These can be correlated through systemic factors and through the losses from the portfolio. The evolution of the large portfolio limit of the empirical measure of the loss can be analysed as a law of large numbers, and Gaussian fluctuations can then be derived around this limit; see Giesecke et al. [12, 13, 24, 25] and Cvitanić et al. [6]. Further, the large deviations can be analysed; see Sowers and Spiliopoulos [26, 27]. It is also possible to take an approach through interacting particle systems, where each firm is in one of two states representing financial health and financial distress, and firms move between states according to some intensity, often firm dependent and dependent on the proportion of losses; see for instance Dai Pra and Tolotti [8] or Dai Pra et al. [7].

Our underlying setup is a structural model for default in which each asset has a distance to default, which we think of as the logarithmically scaled asset price process. The asset price evolves according to a general stochastic volatility model, in which the distance to default of the \(i\)th asset \(X^{i}\) satisfies the system

$$\begin{aligned} dX_{t}^{i} &= \bigg(r_{i}-\frac{h^{2}(\sigma _{t}^{i})}{2}\bigg)\,dt+h( \sigma _{t}^{i})\Big(\sqrt{1-\rho _{1,i}^{2}} \, dW_{t}^{i}+\rho _{1,i} \, dW_{t}^{0}\Big),\quad 0\leq t\leq T_{i}, \\ d\sigma _{t}^{i} &= k_{i}(\theta _{i}-\sigma _{t}^{i}) \, dt+\xi _{i}g( \sigma _{t}^{i})\Big(\sqrt{1-\rho _{2,i}^{2}} \, dB_{t}^{i}+\rho _{2,i} \, dB_{t}^{0}\Big),\quad t\geq 0, \\ X_{t}^{i} &= 0,\quad t > T_{i} := \inf \{ t\geq 0: X_{t}^{i}=0\}, \\ (X_{0}^{i}, \sigma _{0}^{i}) &= (x^{i},\sigma ^{i, \mathrm{init}}), \end{aligned}$$
(1.1)

for all \(i \in \mathbb{N}\). The coefficient vectors \(C_{i} = (r_{i}, \rho _{1,i}, \rho _{2,i}, k_{i}, \theta _{i}, \xi _{i})\) are picked randomly and independently from some probability distribution with \(\rho _{1,i}, \rho _{2,i} \in [0, 1)\), the infinite sequence \(((x^{1}, \sigma ^{1, \mathrm{init}}), (x^{2}, \sigma ^{2, \mathrm{init}}), \dots )\) of random vectors in \(\mathbb{R}^{2}\) is assumed to be exchangeable (that is, the joint distribution of every finite subset is invariant under permutations), and \(g,h\) are functions for which we give suitable conditions later. The exchangeability condition implies (by de Finetti’s theorem; see Kotelenez and Kurtz [19, Theorem 4.1] and Bernardo and Smith [1, Chap. 4] for a proof) the existence of a \(\sigma \)-algebra \(\mathcal{G} \subseteq \sigma (\{(x^{i}, \sigma ^{i, \mathrm{init}}): i \in \mathbb{N}\})\), given which the two-dimensional random vectors \((x^{i}, \sigma ^{i, \mathrm{init}})\) are independent and identically distributed. The idiosyncratic Brownian motions \(W^{i}, B^{i}\) for \(i\in \mathbb{N}\) are taken to be pairwise independent, and also independent of the systemic Brownian motions \(W^{0}, B^{0}\), which have a constant correlation \(\rho _{3}\).
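As a concrete illustration, the system (1.1) can be simulated with a straightforward Euler–Maruyama scheme. The sketch below uses the hypothetical choices \(h(x) = g(x) = \sqrt{|x|}\) (a Heston/CIR-type model) and a single illustrative parameter set, none of which is prescribed by the text:

```python
import numpy as np

def simulate_portfolio(N=200, T=1.0, steps=500, r=0.02, rho1=0.4, rho2=0.3,
                       rho3=0.0, kappa=2.0, theta=0.3, xi=0.5,
                       x0=1.0, sig0=0.3, seed=0):
    """Euler-Maruyama sketch of system (1.1) with the illustrative choices
    h(x) = g(x) = sqrt(|x|); all firms share the systemic pair (W^0, B^0),
    each has its own idiosyncratic pair (W^i, B^i), and distance-to-default
    paths are absorbed at the barrier 0."""
    rng = np.random.default_rng(seed)
    dt = T / steps
    h = g = lambda x: np.sqrt(np.abs(x))
    X = np.full(N, x0)
    sig = np.full(N, sig0)
    alive = np.ones(N, dtype=bool)
    for _ in range(steps):
        dW0 = rng.normal(0.0, np.sqrt(dt))
        # B^0 has constant correlation rho3 with W^0
        dB0 = rho3 * dW0 + np.sqrt(1 - rho3**2) * rng.normal(0.0, np.sqrt(dt))
        dWi = rng.normal(0.0, np.sqrt(dt), N)
        dBi = rng.normal(0.0, np.sqrt(dt), N)
        hs = h(sig)
        X = X + (r - hs**2 / 2) * dt \
              + hs * (np.sqrt(1 - rho1**2) * dWi + rho1 * dW0)
        sig = sig + kappa * (theta - sig) * dt \
                  + xi * g(sig) * (np.sqrt(1 - rho2**2) * dBi + rho2 * dB0)
        alive &= X > 0
        X = np.where(alive, X, 0.0)   # absorb defaulted firms at 0
    return X, 1.0 - alive.mean()      # terminal states and empirical loss

X_T, loss = simulate_portfolio()
```

The returned `loss` is the proportion of absorbed paths, i.e., the finite-\(N\) analogue of the loss process discussed below.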

We regard this as a system for \(Z^{i}=(X^{i},\sigma ^{i})\) with

$$ dZ^{i} = b^{i}(Z^{i})\, dt + \Sigma ^{i}(Z^{i}) \,d{\mathbf{W}^{i}}, \qquad Z^{i}_{0} = (x^{i},\sigma ^{i, \mathrm{init}}) $$

for \(t< T_{i}\), where

$$\begin{aligned} b^{i}(X,\sigma ) &= \bigg(r_{i}-\frac{h(\sigma )^{2}}{2}, k_{i}( \theta _{i}-\sigma )\bigg)^{\top }, \\ \Sigma ^{i}(X,\sigma ) &= \Biggl( \textstyle\begin{array}{c@{\quad }c@{\quad }c@{\quad }c} h(\sigma )\sqrt{1-\rho _{1,i}^{2}} & h(\sigma )\rho _{1,i} & 0 & 0 \\ 0 & 0 & \xi _{i}g (\sigma ) \sqrt{1-\rho _{2,i}^{2}} & \xi _{i}g ( \sigma ) \rho _{2,i} \end{array}\displaystyle \Biggr) \end{aligned}$$

and \(\mathbf{W}^{i} = (W^{i}, W^{0}, B^{i}, B^{0})^{\top }\). Then the infinitesimal generator of the above two-dimensional process is given by

$$ \mathcal{A}^{i}f = \sum _{j=1}^{2} b_{j}^{i} \frac{\partial f}{\partial x_{j}} + \frac{1}{2} \sum _{j,k=1}^{2} a_{jk}^{i} \frac{\partial ^{2}f}{\partial x_{j}\partial x_{k}} $$

for \(f\in C^{2}(\mathbb{R}_{+}\times \mathbb{R}; \mathbb{R})\). The matrix \(A^{i} = (a_{jk}^{i})\) is given by

$$ A^{i} = \left ( \textstyle\begin{array}{c@{\quad }c} h(\sigma )^{2} & h(\sigma )\xi _{i}g(\sigma ) \rho _{1,i}\rho _{2,i} \rho _{3} \\ h(\sigma )\xi _{i}g(\sigma ) \rho _{1,i}\rho _{2,i}\rho _{3} & \xi _{i}^{2} g(\sigma )^{2} \end{array}\displaystyle \right ), $$

as \(A^{i} = \Sigma ^{i} R (\Sigma ^{i})^{\top }\) with \(R\) the covariance matrix for the 4-dimensional Brownian motion \(\mathbf{W}^{i}\).
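The identity \(A^{i} = \Sigma ^{i} R (\Sigma ^{i})^{\top }\) is easy to verify numerically. The sketch below fixes arbitrary illustrative values of \(h(\sigma )\), \(g(\sigma )\), \(\xi \) and the correlations (none prescribed by the text) and checks the product against the closed form above:

```python
import numpy as np

# Illustrative values for h(sigma), g(sigma) and the coefficients
h_s, g_s = 0.4, 0.25
xi, rho1, rho2, rho3 = 0.7, 0.5, 0.3, 0.2

# Sigma^i(X, sigma): 2x4 diffusion matrix against W = (W^i, W^0, B^i, B^0)
Sigma = np.array([
    [h_s * np.sqrt(1 - rho1**2), h_s * rho1, 0.0, 0.0],
    [0.0, 0.0, xi * g_s * np.sqrt(1 - rho2**2), xi * g_s * rho2],
])

# R: covariance matrix of W; only W^0 and B^0 (indices 1 and 3) are correlated
R = np.eye(4)
R[1, 3] = R[3, 1] = rho3

A = Sigma @ R @ Sigma.T
A_closed = np.array([
    [h_s**2,                        h_s * xi * g_s * rho1 * rho2 * rho3],
    [h_s * xi * g_s * rho1 * rho2 * rho3, xi**2 * g_s**2],
])
assert np.allclose(A, A_closed)
```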

We can show that the empirical measure of a sequence of finite sub-systems,

$$ \nu ^{N}_{t} = \frac{1}{N} \sum _{i=1}^{N} \delta _{X^{i}_{t},\sigma ^{i}_{t}}, $$

converges weakly as \(N\to \infty \) (see [17]) to the probability distribution of \(Z_{t}^{1}\) given \(W^{0}\), \(B^{0}\) and \(\mathcal{G}\). This measure consists of two parts: its restriction to the line \(x= 0\) which is approximated by the restriction of \(\nu ^{N}\) to this line, and its restriction to \(\mathbb{R}_{+} \times \mathbb{R}\) which possesses a two-dimensional density \(u(t,x,y)\). The density \(u(t,x,y)\) can be regarded as an average of solutions to certain two-dimensional SPDEs with a Dirichlet boundary condition on the line \(x=0\). In particular, we can write \(u = \mathbb{E}[u_{C_{1}} \, | \, W^{0}, B^{0}, \mathcal{G}]\), where \(u_{C_{1}}(t,x,y)\) is the probability density of \(Z_{t}^{1}\) given \(W^{0}\), \(B^{0}\), \(\mathcal{G}\) and \(C_{1}\) on \(\mathbb{R}_{+} \times \mathbb{R}\), which satisfies, for any value of the coefficient vector \(C_{1}\), the two-dimensional SPDE

$$ du_{C_{1}} = \mathcal{A}^{1,*} u_{C_{1}} \,dt + \mathcal{B}^{1,*}u_{C_{1}} \,d(W^{0},B^{0})^{\top }, $$
(1.2)

where \(\mathcal{A}^{1,*}\) is the adjoint of the generator \(\mathcal{A}^{1}\) of \(Z^{1}\) and the operator \(\mathcal{B}^{1,*}\) is given by

$$ \mathcal{B}^{1,*}f = \left (-\rho _{1,1}h(y) \frac{\partial f}{\partial x}, -\xi _{1}\rho _{2,1}g\left (y\right ) \frac{\partial f}{\partial y} \right ). $$

The boundary condition is that \(u_{C_{1}}(t,0,y)=0\) for all \(y\in \mathbb{R}\). In the special case where the coefficients are constants independent of \(i\), \(u\) is itself a solution to the stochastic partial differential equation (1.2).

One reason for studying the large portfolio limit is the need to have a useful approximation which captures the dynamics among the asset prices when the number of assets is large. Moreover, by studying the limit SPDE instead of a finite sub-system of (1.1), we can potentially provide a more efficient approach to capturing the key drivers of a large portfolio without having to simulate a large number of idiosyncratic Brownian paths.

Of central importance will be the loss process \(L\), the value of which at each time \(t > 0\) is given by

$$ L_{t} = \mathbb{P} [X_{t}^{1} = 0 \, | \, W^{0}, B^{0}, \, \mathcal{G}], $$

i.e., the mass on the line \(x=0\) of the probability distribution of \(Z_{t}^{1}\) given \(W^{0}\), \(B^{0}\) and \(\mathcal{G}\). This quantity is approximated, as \(N \rightarrow \infty \), by the mass of \(\nu _{t}^{N}\) on the line \(x=0\), which is equal to the proportion of defaulted assets at time \(t\) in the finite sub-system of size \(N\), and thus it measures the total loss in the large portfolio limit at time \(t\). The distribution of \(L\) is a simple measure of risk for the portfolio of assets and can be used to find the probability of a large loss, or to determine the prices of portfolio credit derivatives such as CDOs that can be written as expectations of suitable functions of \(L\). Thus our focus will be on estimating probabilities of the form

$$\begin{aligned} \mathbb{P}[L_{t} \in (1 - b, 1 - a )] = \mathbb{P}\big[\mathbb{P}[X_{t}^{1} > 0 \, | \, W^{0}, B^{0}, \mathcal{G}] \in (a, b )\big] \end{aligned}$$
(1.3)

for some \(0 \leq a < b \leq 1\), that is, the probability that the total loss from the portfolio lies within a certain range. Probabilities of the above form can be approximated numerically with a simulated sample of values of \(L_{t}\), obtained via

$$\begin{aligned} 1 - L_{t} =& \mathbb{P} [X_{t}^{1} > 0 \, | \, W^{0}, B^{0}, \mathcal{G} ] \\ =& \int _{0}^{+\infty }\int _{0}^{+\infty }\mathbb{E} [u_{C_{1}}(t,x,y) \, | \, W^{0}, B^{0}, \mathcal{G} ]\,dx\,dy \\ \approx & \frac{1}{n}\sum _{i=1}^{n}{\int _{0}^{+\infty }\int _{0}^{+ \infty }u_{c_{1,i}}(t,x,y)\,dx\,dy} \end{aligned}$$
(1.4)

after solving the SPDE (1.2) for \(u_{C_{1}}\) numerically, for a sample \(\{c_{1,1}, \dots , c_{1,n} \}\) of values of the vector \(C_{1}\). In the special case when asset prices are modelled as simple constant volatility models, the numerics (see Giles and Reisinger [14] or Bujok and Reisinger [4] for jump-diffusion models) have a significantly smaller computational cost, which motivates the investigation of the existence of accurate approximations using a constant volatility setting in the general case. We also note that one-dimensional SPDEs describing large portfolio limits in constant volatility environments have been found to have a unique regular solution (see Bush et al. [5] or Hambly and Ledger [18] for a loss-dependent correlation model), an important component of the numerical analysis and a counterpoint to the fact that we have been unable to establish uniqueness of solutions to the two-dimensional SPDE arising in the CIR volatility case [15].
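Once a simulated sample of values of \(L_{t}\) is available, estimating a probability of the form (1.3) reduces to counting how many sample points fall in the band. A minimal sketch (the function name and the beta-distributed stand-in sample are illustrative, not part of the model):

```python
import numpy as np

def loss_band_probability(losses, a, b):
    """Monte Carlo estimate of P[L_t in (1-b, 1-a)] from a sample of L_t values,
    i.e. the probability that the surviving mass lies in (a, b) as in (1.3)."""
    losses = np.asarray(losses)
    return np.mean((losses > 1 - b) & (losses < 1 - a))

# hypothetical sample standing in for simulated values of L_t
rng = np.random.default_rng(1)
sample = rng.beta(2, 8, size=10_000)
p = loss_band_probability(sample, a=0.7, b=0.95)
assert 0.0 <= p <= 1.0
```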

We derive our one-dimensional approximations under two different settings with fast mean-reverting volatility. In what we call the large vol-of-vol setting, the mean-reversion and volatility in the second equation in (1.1) are scaled by suitable powers of \(\epsilon \) in that \(k_{i}=\kappa _{i}/\epsilon \) and \(\xi _{i} = v_{i}/\sqrt{\epsilon }\), giving

$$ d\sigma _{t}^{i} = \frac{\kappa _{i}}{\epsilon }(\theta _{i}-\sigma _{t}^{i}) \,dt+\frac{v_{i}}{\sqrt{\epsilon }} g(\sigma _{t}^{i})\Big(\sqrt{1-\rho _{2,i}^{2}}\,dB_{t}^{i}+ \rho _{2,i}\,dB_{t}^{0}\Big),\qquad t\geq 0, $$

and then we take \(\epsilon \to 0\). This is distributionally equivalent to speeding up the volatility processes by scaling the time \(t\) by \(\epsilon \), when \(\epsilon \) is small. Our aim is to take the limit as \(\epsilon \to 0\), so that when the system of volatility processes is positive recurrent, averages over finite time intervals involving the sped-up volatility processes will approximate the corresponding stationary means. In the limit, we obtain a constant volatility large portfolio model which could be used as an effective approximation when volatilities are fast mean-reverting. However, this speeding up does not lead to strong convergence of the volatility processes, allowing only weak convergence of our system, which can only be established when \(\rho _{3} = 0\) (effectively separating the time scales) and when \((\kappa _{i}, \theta _{i}, v_{i}, \rho _{2,i})\) is the same constant vector \((\kappa , \theta , v, \rho _{2})\) for all \(i \in \mathbb{N}\).

The case of small vol-of-vol has the mean-reversion in the second equation in (1.1) scaled by \(\epsilon \) in that \(k_{i}=\kappa _{i}/\epsilon \) and

$$ d\sigma _{t}^{i} = \frac{\kappa _{i}}{\epsilon }(\theta _{i}-\sigma _{t}^{i}) \,dt+\xi _{i} g (\sigma _{t}^{i} )\Big(\sqrt{1-\rho _{2,i}^{2}}\,dB_{t}^{i}+ \rho _{2,i}\,dB_{t}^{0}\Big),\qquad t\geq 0. $$

We regard this case as a small noise perturbation of the constant volatility model, where volatilities have stochastic behaviour but are pulled towards their mean as soon as they move away from it due to a large mean-reverting drift. When \(\epsilon \to 0\), the drifts of the volatilities tend to infinity and dominate the corresponding diffusion parts since the vol-of-vols remain small, allowing the whole system to converge to a constant volatility setting in a strong sense. This strong convergence allows the rate of convergence of probabilities of the form (1.3) to be estimated and gives us a quantitative measure of the loss in accuracy in the estimation of these probabilities when a constant volatility large portfolio model is used to replace a more realistic stochastic volatility perturbation of that model.

In Sects. 2 and 3, we present our main results for both settings. The results are then proved in Sects. 4 and 5. Finally, the proofs of two propositions showing the positive recurrence, and hence the applicability of our results, for two classes of models can be found in the Appendix.

2 The main results: large vol-of-vol setting

We begin with the study of the fast mean-reversion/large vol-of-vol setting, for which we need to assume that the correlation \(\rho _{3}\) of \(W^{0}\) and \(B^{0}\) is zero. When \(g\) is either the square root function or a function behaving almost like a positive constant for large values of the argument, it has been proved in [15, Theorem 4.3] and in [17, Theorem 4.1], respectively, that

$$ u_{C_{1}}(t, x, y) = p_{t}(y|B^{0},\mathcal{G} ) \mathbb{E}\big[u \big(t,x,W^{0},\mathcal{G},C_{1},h (\sigma ^{1} )\big)\,\big| \,W^{0}, \sigma _{t}^{1}= y,B^{0},C_{1},\mathcal{G}\big], $$

where \(p_{t}\) is the density of each volatility path when the path of \(B^{0}\) is given, and \(u(t,x,W^{0},\mathcal{G},C_{1},h(\sigma ^{1}))\) is the unique \(H_{0}^{1}\left (0, +\infty \right )\)-solution to the SPDE

$$\begin{aligned} u(t, x) =& u_{0}(x) - \int _{0}^{t}\bigg(r- \frac{h^{2} (\sigma _{s}^{1} )}{2}\bigg)u_{x}(s, x)\,ds \\ & +\int _{0}^{t}\frac{h^{2} (\sigma _{s}^{1} )}{2} u_{xx}(s, x)\,ds- \rho _{1,1}\int _{0}^{t}h (\sigma _{s}^{1} )u_{x}(s, x)\,dW_{s}^{0}, \end{aligned}$$
(2.1)

where \(u_{0}\) is the density of each \(x^{i}\) given \(\mathcal{G}\). In the above expression for the two-dimensional density \(u_{C_{1}}(t, x, y)\), averaging happens with respect to the idiosyncratic noises, and since we are interested in probabilities concerning \(L_{t}\) which is computed by substituting that density in (1.4), averaging happens with respect to the market noise \((W^{0}, B^{0})\) as well. Therefore, we can replace \((W^{i}, B^{i})\) for all \(i \geq 0\) in our system by objects having the same joint law. In particular, setting \(k_{i}=\kappa _{i}/\epsilon \) and \(\xi _{i} = v_{i}/\sqrt{\epsilon }\), the \(i\)th asset’s distance to default \(X^{i, \epsilon }\) satisfies the system

$$\begin{aligned} X_{t}^{i, \epsilon } &= x^{i} + \int _{0}^{t}\bigg(r_{i}- \frac{h^{2}(\sigma _{s}^{i, \epsilon })}{2}\bigg)\,ds \\ &\phantom{=:}+ \int _{0}^{t}h(\sigma _{s}^{i, \epsilon })\Big(\sqrt{1-\rho _{1,i}^{2}} \,dW_{s}^{i}+\rho _{1,i}\,dW_{s}^{0}\Big),\qquad 0\leq t\leq T_{i}^{\epsilon }, \\ \sigma _{t}^{i, \epsilon } &= \sigma ^{i, \mathrm{init}} + \frac{\kappa _{i}}{\epsilon }\int _{0}^{t}(\theta _{i}-\sigma _{s}^{i, \epsilon })\,ds + \frac{v_{i}}{\sqrt{\epsilon }}\int _{0}^{t}g ( \sigma _{s}^{i, \epsilon } )\,d\Big(\sqrt{1-\rho _{2,i}^{2}}B_{s}^{i} + \rho _{2,i}B_{s}^{0}\Big), \\ X_{t}^{i, \epsilon } &= 0,\qquad t > T_{i}^{\epsilon } := \inf \{ t \geq 0: X_{t}^{i, \epsilon }=0\}, \end{aligned}$$

where the superscripts \(\epsilon \) are used to underline the dependence on \(\epsilon \). If we substitute \(t = \epsilon t'\) and \(s = \epsilon s'\) for \(0 \leq s' \leq t'\) and then replace \((W^{i}, B^{i})\) by \((W^{i}, \sqrt{\epsilon }B_{\frac{\cdot }{\epsilon }}^{i})\) for all \(i \geq 0\) which have the same joint law, the SDE satisfied by the \(i\)th volatility process becomes

$$ \sigma _{\epsilon t'}^{i, \epsilon } = \sigma ^{i, \mathrm{init}} + \kappa _{i}\int _{0}^{t'}(\theta _{i}-\sigma _{\epsilon s'}^{i, \epsilon })\,ds' + v_{i}\int _{0}^{t'}g (\sigma _{\epsilon s'}^{i, \epsilon } )\,d\Big(\sqrt{1-\rho _{2,i}^{2}}B_{s'}^{i} + \rho _{2,i}B_{s'}^{0} \Big). $$

This shows that \(\sigma ^{i,\epsilon } = \sigma _{\epsilon \frac{\cdot }{\epsilon }}^{i, \epsilon }\) can be replaced by \(\sigma _{\frac{\cdot }{\epsilon }}^{i,1}\) for all \(i \geq 1\), i.e., the \(i\)th volatility process of our model when the mean-reversion coefficient and the vol-of-vol are equal to \(\kappa _{i}\) and \(v_{i}\), respectively, and when the time \(t\) is scaled by \(\epsilon \), speeding up the system of the volatilities when \(\epsilon \) is small.
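The distributional identity used in this substitution, namely that \((\sqrt{\epsilon }B_{t/\epsilon })_{t \geq 0}\) has the law of a Brownian motion, can be checked empirically. A quick sanity check of the endpoint moments (sample size and tolerances are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
t, eps, M = 1.5, 0.01, 200_000
# endpoint of sqrt(eps) * B_{t/eps}: Gaussian with variance eps * (t/eps) = t,
# matching the law of B_t
scaled = np.sqrt(eps) * rng.normal(0.0, np.sqrt(t / eps), size=M)
assert abs(scaled.var() - t) < 0.05   # Var(B_t) = t
assert abs(scaled.mean()) < 0.05      # E[B_t] = 0
```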

If \(g\) is now chosen so that the system of volatility processes becomes positive recurrent, averages over finite time intervals converge to the corresponding stationary means as the speed tends to infinity, i.e., as \(\epsilon \to 0+\), which is the key for the convergence of our system. We give a definition of the required property for \(g\).

Definition 2.1

We fix the distribution from which each \(C_{i}' = (r_{i}, \rho _{1,i}, \rho _{2,i}, \kappa _{i}, \theta _{i}, v_{i})\) is chosen and denote by \(\mathcal{C}\) the \(\sigma \)-algebra generated by all these coefficient vectors. Then we say that \(g\) has the positive recurrence property when the two-dimensional process \((\sigma _{\cdot }^{i,1}, \sigma _{\cdot }^{j,1})\) is a positive recurrent diffusion for any two \(i, j \in \mathbb{N}\), for almost all values of \(C_{i}'\) and \(C_{j}'\). This means that given \(\mathcal{C}\), there exists a two-dimensional random variable \((\sigma ^{i, j, 1, *}, \sigma ^{i, j, 2, *})\) whose distribution is stationary for \((\sigma _{\cdot }^{i,1}, \sigma _{\cdot }^{j,1})\), and whenever \(\mathbb{E}[|F(\sigma ^{i, j, 1, *}, \sigma ^{i, j, 2, *})| \, | \, \mathcal{C}]\) exists and is finite for some measurable function \(F: \mathbb{R}^{2} \rightarrow \mathbb{R}\), we also have

$$ \lim _{T \rightarrow \infty }\frac{1}{T}\int _{0}^{T}F(\sigma _{s}^{i,1}, \sigma _{s}^{j,1})\,ds = \mathbb{E}[F(\sigma ^{i, j, 1, *}, \sigma ^{i, j, 2, *} ) \, | \, \mathcal{C} ], $$

or equivalently, after a change of variables,

$$ \lim _{\epsilon \rightarrow 0{+}}\frac{1}{t}\int _{0}^{t}F (\sigma _{ \frac{s}{\epsilon }}^{i,1}, \sigma _{\frac{s}{\epsilon }}^{j,1} )\,ds = \mathbb{E} [F (\sigma ^{i, j, 1, *}, \sigma ^{i, j, 2, *} ) \, | \, \mathcal{C} ] $$

for any \(t \geq 0\), ℙ-almost surely.
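For a concrete instance of Definition 2.1, take the Ornstein–Uhlenbeck case \(g \equiv 1\) with a single volatility and \(F(x) = x^{2}\); the stationary law is then \(N(\theta , v^{2}/(2\kappa ))\), so the time average should approach \(\theta ^{2} + v^{2}/(2\kappa )\). A rough numerical check under these assumptions (the parameter values and the long horizon playing the role of \(t/\epsilon \) are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
kappa, theta, v = 2.0, 0.5, 0.4          # illustrative OU parameters
T_long, steps = 5_000.0, 500_000         # T_long plays the role of t/epsilon
dt = T_long / steps
dB = rng.normal(0.0, np.sqrt(dt), size=steps)
sig, acc = theta, 0.0
for db in dB:
    acc += sig**2 * dt                   # accumulate F(sigma) = sigma^2 over time
    sig += kappa * (theta - sig) * dt + v * db   # Euler step for the OU process
time_avg = acc / T_long
stationary_mean = theta**2 + v**2 / (2 * kappa)  # E[F(sigma*)] for the OU model
assert abs(time_avg - stationary_mean) < 0.03
```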

The positive recurrence property is a prerequisite for our convergence results to hold, and we now state two propositions which give classes of models for which this property is satisfied. The first implies in particular that for the Ornstein–Uhlenbeck model (\(g(x) = 1\) for all \(x \in \mathbb{R}\)), we always have the positive recurrence property. The second shows that for the CIR model (\(g(x) = \sqrt{|x|}\) for all \(x \in \mathbb{R}\)), we have the positive recurrence property provided that the random coefficients of the volatilities satisfy certain conditions. The proofs of both propositions can be found in the Appendix.

Proposition 2.2

Suppose that\(g\)is a twice differentiable function, bounded from below by some\(c_{g} > 0\). Suppose also that\(g'(x)\kappa _{i}(\theta _{i} - x) < \kappa _{i}g(x) + \frac{v_{i}}{2}g''(x)g^{2}(x)\)for all\(x \in \mathbb{R}\)and\(i \in \mathbb{N}\), for all possible values of\(C_{i}\). Then\(g\)has the positive recurrence property.

Proposition 2.3

Suppose that\(g(x) = \sqrt{|x|}\,\tilde{g}(x)\), where the function\(\tilde{g}\)is a continuously differentiable, strictly positive and increasing function taking values in\([c_{g}, 1]\)for some\(c_{g} > 0\). Then there exists an\(\eta > 0\)such that\(g\)has the positive recurrence property when\(\Vert C_{i} - C_{j} \Vert _{L^{\infty }(\mathbb{R}^{6})} < \eta \)and\(\frac{\kappa _{i}}{v_{j}^{2}} > \frac{1}{4} + \frac{1}{\sqrt{2}}\)for all\(i, j \in \mathbb{N}\), ℙ-almost surely.

We can now proceed to our main results, which will be governed by the conditional moments \(\sigma _{1,1} = \mathbb{E}[h(\sigma ^{1,1,1,*}) \, | \, \mathcal{C}]\) and \(\sigma _{2,1} = \sqrt{\mathbb{E}[h^{2}(\sigma ^{1,1,1,*}) \, | \, \mathcal{C}]}\) as well as the quantity \(\tilde{\sigma } = \sqrt{\mathbb{E}[h(\sigma ^{1, 2, 1, *})h(\sigma ^{1, 2, 2, *}) \, | \, \mathcal{C}]}\), where \(\sigma ^{1, 1, 1, *}\), \(\sigma ^{1, 2, 1, *}\) and \(\sigma ^{1, 2, 2, *}\) are given in Definition 2.1. The next theorem implies the weak convergence of the loss \(L_{t}^{\epsilon } = 1 - \mathbb{P}[X_{t}^{1, \epsilon } > 0 \, | \, W^{0}, B^{0}, \mathcal{G}]\) under the fast mean-reverting volatility setting to the loss under an appropriate constant volatility setting.

Theorem 2.4

Suppose that\((\kappa _{i}, \theta _{i}, v_{i}, \rho _{2,i}) = (\kappa , \theta , v, \rho _{2})\)for all\(i \in \mathbb{N}\), where\((\kappa , \theta , v, \rho _{2})\)is a deterministic vector in\(\mathbb{R}^{4}\), that the function\(h\)is bounded, and that\(g\)has the positive recurrence property, in which case we have\(\sigma _{1,1} = \mathbb{E}[h(\sigma ^{1,1,1,*})]\), \(\sigma _{2,1} = \sqrt{\mathbb{E}[h^{2}(\sigma ^{1,1,1,*})]}\)and\(\tilde{\sigma } = \sqrt{\mathbb{E}[h(\sigma ^{1, 2, 1, *})h(\sigma ^{1, 2, 2, *})]}\). Consider now the one-dimensional large portfolio model where the distance to default\(X^{i, *}\)of the\(i\)th asset evolves in time according to the system

$$\begin{aligned} X_{t}^{i, *} &= x^{i} + \bigg(r_{i} - \frac{\sigma _{2,1}^{2}}{2} \bigg)t + \tilde{\rho }_{1,i}\sigma _{2,1}W_{t}^{0} + \sqrt{1 - { \tilde{\rho }_{1,i}}^{2}}\sigma _{2,1}W_{t}^{i}, \qquad 0 \leq t \leq T_{i}^{*}, \\ X_{t}^{i, *} &= 0, \qquad t \geq T_{i}^{*} := \inf \{ t\geq 0: X_{t}^{i, *}=0\}, \end{aligned}$$

where\(\tilde{\rho }_{1,i} = \rho _{1,i} \frac{\tilde{\sigma }}{\sigma _{2,1}}\). Then we have the convergence

$$ \mathbb{P} [X_{t}^{1, \epsilon } \in \mathcal{I} \, | \, W^{0}, B^{0}, \mathcal{G} ] \longrightarrow \mathbb{P} [X_{t}^{1, *} \in \mathcal{I} \, | \, W^{0}, \mathcal{G} ] $$

in distribution as\(\epsilon \rightarrow 0{+}\), for any interval\(\mathcal{I} = \left (0, U\right ]\)with\(U \in \left (0, +\infty \right ]\).

Remark 2.5

Since all volatility processes have the same stationary distribution, a simple application of the Cauchy–Schwarz inequality shows that \(\tilde{\sigma } \leq \sigma _{2,1}\), which implies that \(\tilde{\rho }_{1,i} \leq \rho _{1,i} < 1\) so that \(\sqrt{1 - {\tilde{\rho }_{1,i}}^{2}}\) is well defined for each \(i\).

The above theorem gives only weak convergence and only under the restrictive assumption of having the same coefficients in each volatility. For this reason, we also study the asymptotic behaviour of our system from a different perspective. In particular, we fix the volatility path \(\sigma ^{1,1}\) and the coefficient vectors \(C_{i}'\) and study the convergence of the solution \(u^{\epsilon }(t, x)\) to the SPDE (2.1) in the sped-up setting, i.e.,

$$\begin{aligned} u^{\epsilon }(t, x) =& u_{0}(x) - \int _{0}^{t}\bigg(r- \frac{h^{2} (\sigma _{\frac{s}{\epsilon }}^{1,1} )}{2}\bigg)u_{x}^{ \epsilon }(s, x)\,ds \\ & +\int _{0}^{t} \frac{h^{2} (\sigma _{\frac{s}{\epsilon }}^{1,1} )}{2} u_{xx}^{ \epsilon }(s, x)\,ds-\rho _{1,1}\int _{0}^{t}h (\sigma _{ \frac{s}{\epsilon }}^{1,1} )u_{x}^{\epsilon }(s, x)\,dW_{s}^{0}, \end{aligned}$$
(2.2)

which is used to compute the loss \(L_{t}^{\epsilon }\).

We now write \(\mathbb{E}_{\sigma , \mathcal{C}}\) to denote the expectation given the volatility path \(\sigma ^{1,1}\) and the \(C_{i}'\) which we have fixed, and \(L_{\sigma , \mathcal{C}}^{2}\) to denote the corresponding \(L^{2}\)-norms. By part 2 of Theorem 4.1 in [15], the solution \(u^{\epsilon }\) to the above SPDE satisfies the identity

$$ \Vert u^{\epsilon }(t, \cdot ) \Vert _{L^{2}(\mathbb{R}_{+})}^{2}+ (1- \rho _{1,1}^{2} )\int _{0}^{t}h^{2} (\sigma _{\frac{s}{\epsilon }}^{1,1} ) \Vert u_{x}^{\epsilon }(s, \cdot )\Vert _{L^{2}(\mathbb{R}_{+})}^{2} \,ds= \Vert u_{0} \Vert _{L^{2}(\mathbb{R}_{+})}^{2}, $$
(2.3)

which shows that the \(L^{2}(\mathbb{R}_{+})\)-norm of \(u^{\epsilon }(t, \, \cdot )\) and the \(L^{2}([0, \, T] \times \mathbb{R}_{+})\)-norm of \(u^{\epsilon }\) are both uniformly bounded, for all \(0 \leq t \leq T\), by a random variable which has a finite \(L_{\sigma , \mathcal{C}}^{2}(\Omega )\)-norm (the initial data assumptions made in [15] are also needed for this to hold). Therefore, since \(L^{2}\)-spaces are reflexive by Brézis [3, Theorem 4.10], it follows from [3, Theorem 3.18] that for a given sequence of values of \(\epsilon \) tending to zero, we can always find a subsequence \((\epsilon _{n})_{n \in \mathbb{N}}\) and an element \(u^{*} \in L_{\sigma , \mathcal{C}}^{2}([0, \, T] \times \mathbb{R}_{+} \times \Omega )\) such that \(u^{\epsilon _{n}} \rightarrow u^{*}\) weakly in \(L_{\sigma , \mathcal{C}}^{2}([0, \, T] \times \mathbb{R}_{+} \times \Omega )\) as \(n \rightarrow \infty \). The characterisation of the weak limits \(u^{*}\) is given in the following theorem.

Theorem 2.6

Suppose that\(g\)has the positive recurrence property and that\(|h(x)| \,{\leq}\, C\)for all\(x \in \mathbb{R}\), for some\(C > 0\). Then whenever we have that\(u^{\epsilon _{n}} \rightarrow u^{*}\)weakly in\(L_{ \sigma , \mathcal{C}}^{2}([0, \, T] \times \mathbb{R}_{+} \times \Omega )\)for some sequence\((\epsilon _{n})_{n \in \mathbb{N}} \subseteq \mathbb{R}_{+}\)with\(\epsilon _{n} \rightarrow 0{+}\), the weak limit\(u^{*}\)is a weak solution to the SPDE

$$\begin{aligned} u^{*}(t,x) &= u_{0}(x) - \bigg(r-\frac{\sigma _{2,1}^{2}}{2}\bigg) \int _{0}^{t}u_{x}^{*}(s,\,x)\,ds \\ &\phantom{=:} + \frac{\sigma _{2,1}^{2}}{2}\int _{0}^{t} u_{xx}^{*}(s,x)\,ds-\rho _{1,1} \sigma _{1,1}\int _{0}^{t}u_{x}^{*}(s,x)\,dW_{s}^{0}. \end{aligned}$$
(2.4)

Furthermore, if we have\(|h(x)| > c\)for all\(x \in \mathbb{R}\)for some\(c > 0\), there is always a subsequence\((\epsilon _{k_{n}})_{n \in \mathbb{N}}\)of\((\epsilon _{n})_{n \in \mathbb{N}}\)such that\(u^{\epsilon _{k_{n}}} \rightarrow u^{*}\)weakly in the smaller space\(H_{0}^{1}(\mathbb{R}_{+})\times L_{\sigma , \mathcal{C}}^{2}(\Omega \times [0, \, T])\), in which (2.4) has a unique solution. In that case, since\((\epsilon _{n})_{n \in \mathbb{N}}\)can be taken to be a subsequence of an arbitrary sequence of values of\(\epsilon \)tending to zero, we have that\(u^{\epsilon }\)converges weakly in\(H_{0}^{1}(\mathbb{R}_{+})\times L_{\sigma , \mathcal{C}}^{2}(\Omega \times [0, \, T])\)to the unique solution to (2.4) in that space as\(\epsilon \rightarrow 0{+}\).

It is not hard to see that the limiting SPDE (2.4) obtained in Theorem 2.6 corresponds to a constant volatility large portfolio model like the one given in Theorem 2.4 under the assumption that \((\kappa _{i}, \theta _{i}, v_{i}, \rho _{2,i}) = (\kappa , \theta , v, \rho _{2})\), but with the correlation coefficients \(\tilde{\rho }_{1,i} = \rho _{1,i} \frac{\tilde{\sigma }}{\sigma _{2,1}}\) replaced by \(\rho _{1,i}' = \rho _{1,i}\frac{\sigma _{1,1}}{\sigma _{2,1}}\). This indicates that the convergence of the loss \(L_{t}^{\epsilon }\) can only be established in a weak sense, as in general we will have \(\tilde{\sigma } > \sigma _{1,1}\) and thus \(\tilde{\rho }_{1,i} > \rho _{1,i}'\) for all \(i\). This is stated explicitly in the next proposition and its corollary.

Proposition 2.7

Under the assumptions of Theorem 2.4, we always have that\(\tilde{\sigma }\)lies in\([\sigma _{1,1}, \sigma _{2,1}]\). The lower and upper bounds are generally attained only when the volatilities are uncorrelated\((\rho _{2} = 0)\)and perfectly correlated\((\rho _{2} \to 1)\), respectively.

Corollary 2.8

In general, the convergence established in Theorem 2.4does not hold in any stronger sense, unless there is no market noise affecting all the volatilities in our setting.
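The ordering in Proposition 2.7 can be seen in closed form in the Ornstein–Uhlenbeck case: each stationary volatility is \(N(\theta , v^{2}/(2\kappa ))\), two volatilities sharing the common noise with weight \(\rho _{2}\) have stationary covariance \(\rho _{2}^{2}v^{2}/(2\kappa )\), and taking \(h(x) = x\) for illustration (unbounded, so outside the theorem's assumptions, but with all moments finite) gives the following sketch:

```python
import numpy as np

kappa, theta, v = 2.0, 0.5, 0.4
s2 = v**2 / (2 * kappa)          # stationary variance of each OU volatility

def moments(rho2):
    """sigma_{1,1}, sigma_tilde, sigma_{2,1} for the OU model with h(x) = x."""
    sigma_11 = theta                                  # E[h(sigma*)]
    sigma_21 = np.sqrt(theta**2 + s2)                 # sqrt(E[h^2(sigma*)])
    # stationary covariance of two volatilities with common-noise weight rho2
    sigma_tl = np.sqrt(theta**2 + rho2**2 * s2)
    return sigma_11, sigma_tl, sigma_21

for rho2 in (0.0, 0.3, 0.7, 0.999):
    low, mid, high = moments(rho2)
    assert low <= mid <= high                  # sigma_tilde in [sigma_11, sigma_21]
assert abs(moments(0.0)[1] - theta) < 1e-12    # lower bound attained at rho2 = 0
assert abs(moments(1.0)[1] - moments(1.0)[2]) < 1e-12   # upper bound as rho2 -> 1
```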

3 The main results: small vol-of-vol setting

We now proceed to the small vol-of-vol setting, where only the volatility drifts are scaled by \(\epsilon \), i.e., \(k_{i} = \kappa _{i}/\epsilon \) for all \(i\). This leads to the model where the \(i\)th asset’s distance to default satisfies

$$\begin{aligned} X_{t}^{i, \epsilon } &= x^{i} + \int _{0}^{t}\bigg(r_{i}- \frac{h^{2}(\sigma _{s}^{i, \epsilon })}{2}\bigg)\,ds \\ &\phantom{=:} + \int _{0}^{t}h(\sigma _{s}^{i, \epsilon })\Big(\sqrt{1-\rho _{1,i}^{2} \,}\,dW_{s}^{i}+\rho _{1,i}\,dW_{s}^{0}\Big),\qquad 0\leq t\leq T_{i}^{ \epsilon }, \\ \sigma _{t}^{i, \epsilon } &= \sigma ^{i, \mathrm{init}} + \int _{0}^{t} \frac{\kappa _{i}}{\epsilon }(\theta _{i}-\sigma _{s}^{i, \epsilon }) \,ds+\xi _{i} \int _{0}^{t} g (\sigma _{s}^{i, \epsilon } )\Big(\sqrt{1- \rho _{2,i}^{2}}\,dB_{s}^{i}+\rho _{2,i}\,dB_{s}^{0}\Big), \\ X_{t}^{i, \epsilon } &= 0,\qquad t > T_{i}^{\epsilon } := \inf \{s \geq 0: X_{s}^{i, \epsilon } \leq 0\}. \end{aligned}$$

The main feature of the above model is that when the random coefficients and the function \(g\) satisfy certain conditions, the \(i\)th volatility process \(\sigma ^{i, \epsilon }\) converges in a strong sense to the \(\mathcal{C}\)-measurable mean \(\theta _{i}\) as \(\epsilon \rightarrow 0{+}\) for all \(i \in \mathbb{N}\), and we can also determine the rate of convergence. The required conditions are the following, and they are assumed to hold throughout the rest of this section:

1) The i.i.d. random variables \(\sigma ^{i, \mathrm{init}}, \xi _{i}, \theta _{i}, \kappa _{i}\) take values in some compact subinterval of ℝ, with each \(\kappa _{i}\) being bounded from below by some deterministic constant \(c_{\kappa } > 0\).

2) \(g\) is a \(C^{1}\)-function with at most linear growth (i.e., \(\left |g(x)\right | \leq C_{1,g} + C_{2,g}|x|\) for some \(C_{1,g}, C_{2,g} > 0\) and all \(x \in \mathbb{R}\)).

3) Both the function \(h\) and its derivative have polynomial growth.

Under the above conditions, the convergence of each volatility process to its mean is given in the following proposition.

Proposition 3.1

For any\(t \geq 0\)and\(p \geq 1\), we have\(\sigma ^{i, \epsilon } \rightarrow \theta _{i}\)as\(\epsilon \rightarrow 0{+}\)in\(L^{p}(\Omega \times [0, t])\)at a rate of\(\epsilon ^{\frac{1}{p}}\), that is, \(\Vert \sigma ^{i, \epsilon } - \theta _{i} \Vert _{L^{p}(\Omega \times [0, t])}^{p} = \mathcal{O}(\epsilon )\)as\(\epsilon \rightarrow 0{+}\).
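The rate in Proposition 3.1 is transparent in the noiseless case \(\xi _{i} = 0\), where \(\sigma _{s}^{i, \epsilon } = \theta _{i} + (\sigma ^{i, \mathrm{init}} - \theta _{i})e^{-\kappa _{i}s/\epsilon }\) and the \(L^{p}\)-norm can be computed exactly. A sketch under this assumption (parameter values are illustrative), checking that halving \(\epsilon \) halves \(\Vert \sigma ^{\epsilon } - \theta \Vert _{L^{p}([0,t])}^{p}\):

```python
import numpy as np

kappa, theta, sig0, t, p = 2.0, 0.5, 1.1, 1.0, 2   # illustrative values

def lp_error(eps, n=200_000):
    """|| sigma^eps - theta ||_{L^p([0,t])}^p for the noiseless (xi = 0) case,
    where sigma^eps_s = theta + (sig0 - theta) * exp(-kappa * s / eps)."""
    s = np.linspace(0.0, t, n)
    dev_p = (np.abs(sig0 - theta) * np.exp(-kappa * s / eps)) ** p
    h = s[1] - s[0]
    return float(np.sum((dev_p[1:] + dev_p[:-1]) * 0.5 * h))  # trapezoid rule

e1, e2 = lp_error(1e-3), lp_error(5e-4)
# closed form: |sig0 - theta|^p * eps/(p*kappa) * (1 - exp(-p*kappa*t/eps)) = O(eps)
assert abs(e1 / e2 - 2.0) < 0.01
assert abs(e1 - abs(sig0 - theta)**p * 1e-3 / (p * kappa)) < 1e-6
```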

The reason for having only weak convergence of our system in the large vol-of-vol setting was the fact that the limiting quantities \(\sigma _{1,1}\), \(\sigma _{2,1}\) and \(\tilde{\sigma }\) did not coincide. On the other hand, Proposition 3.1 implies that the corresponding limits in the small vol-of-vol setting are equal, allowing us to hope for our system to converge in a stronger sense.

Let \(u^{\epsilon }\) be the solution to the SPDE (2.1) in the small vol-of-vol setting,

$$\begin{aligned} u^{\epsilon }(t, x) =& u_{0}(x) - \int _{0}^{t}\bigg(r- \frac{h^{2} (\sigma _{s}^{1,\epsilon } )}{2}\bigg)u_{x}^{\epsilon }(s, x)\,ds \\ & +\int _{0}^{t}\frac{h^{2} (\sigma _{s}^{1,\epsilon } )}{2} u_{xx}^{ \epsilon }(s, x)\,ds-\rho _{1,1}\int _{0}^{t}h (\sigma _{s}^{1, \epsilon } )u_{x}^{\epsilon }(s, x)\,dW_{s}^{0}, \end{aligned}$$
(3.1)

where we have fixed the volatility paths and the random coefficients. Working as in the proof of Theorem 2.6, it is possible to establish similar asymptotic properties for the SPDE (3.1) as \(\epsilon \to 0+\). However, we are going to work with the antiderivative \(v^{0, \epsilon }\) defined by \(v^{0, \epsilon }(t, x) = \int _{x}^{+\infty }u^{\epsilon }(t, y)\,dy\) for all \(t, x \geq 0\), which satisfies the same SPDE but with different initial and boundary conditions. This is more convenient since the loss \(L_{t}^{\epsilon } = 1 - \mathbb{P}[X_{t}^{1, \epsilon } > 0 \, | \, W^{0}, B^{0}, \mathcal{G}]\) equals the expectation of \(v^{0, \epsilon }(t, 0)\) given \(W^{0}\), \(B^{0}\) and \(\mathcal{G}\) (that is, the average over all possible volatility paths and coefficient values), while the convergence of \(v^{0, \epsilon }\) can be established in a much stronger sense and without the need to assume that \(W^{0}\) and \(B^{0}\) are uncorrelated. Our main result is stated below.

Theorem 3.2

Define \(v^{0}(t, x) := \int _{x}^{+\infty }u^{0}(t, y)\,dy\) for all \(t, x \geq 0\), where \(u^{0}\) is the unique solution to the SPDE

$$\begin{aligned} u^{0}(t, x) =& u_{0}(x) - \int _{0}^{t}\left (r- \frac{h^{2}\left (\theta _{1}\right )}{2}\right )u_{x}^{0}(s, x)\,ds \\ & +\int _{0}^{t}\frac{h^{2}\left (\theta _{1}\right )}{2} u_{xx}^{0}(s, x)\,ds-\rho _{1,1}\int _{0}^{t}h\left (\theta _{1}\right )u_{x}^{0}(s, x)\,dW_{s}^{0} \end{aligned}$$
(3.2)

in \(L^{2}(\Omega \times [0, T]; H_{0}^{1}(\mathbb{R}_{+}))\) (see Bush et al. [5] and Ledger [21]), which arises from the constant volatility model

$$\begin{aligned} dX_{t}^{i, *} &= \bigg(r_{i}-\frac{h^{2}(\theta _{i})}{2}\bigg)\,dt+h( \theta _{i})\Big(\sqrt{1-\rho _{1,i}^{2}}\,dW_{t}^{i}+\rho _{1,i}\,dW_{t}^{0} \Big),\qquad 0\leq t\leq T_{i}, \\ X_{t}^{i, *} &= 0,\qquad t > T_{i} := \inf \{s \geq 0: X_{s}^{i, *} \leq 0\}, \\ X_{0}^{i, *} &= x^{i}, \end{aligned}$$
(3.3)

for \(i \in \mathbb{N}\). Then \(v^{0, \epsilon }\) converges to \(v^{0}\) as \(\epsilon \rightarrow 0{+}\) strongly in the Sobolev space \(L^{2}(\Omega \times [0, T]; H^{1}(\mathbb{R}_{+}))\) for any \(T > 0\), and the rate of convergence is \(\sqrt{\epsilon }\), that is, \(\Vert v^{0, \epsilon } - v^{0} \Vert _{L^{2}(\Omega \times [0, T]; H^{1}(\mathbb{R}_{+}))} = \mathcal{O}(\sqrt{\epsilon })\) as \(\epsilon \to 0{+}\).

The SPDE (3.2) corresponds to the model (3.3) in the sense that given the loss \(L_{t}\), the mass \(1 - L_{t}\) of non-defaulted assets equals

$$ \mathbb{P} [X_{t}^{1, *} > 0 \, | \, W^{0}, \mathcal{G} ] = \mathbb{E} [v^{0} (t, 0 ) \, | \, W^{0}, \mathcal{G} ] = \int _{0}^{+ \infty }\mathbb{E} [u^{0} (t, x ) \, | \, W^{0}, \mathcal{G} ]\,dx. $$

In order to estimate the rate of convergence of probabilities of the form (1.3), we consider the approximation error

$$ E (x, T ) = \int _{0}^{T}\big|\mathbb{P}\big[\mathbb{P} [X_{t}^{1, \epsilon } > 0 \, | \, W^{0}, B^{0}, \mathcal{G} ] > x\big] - \mathbb{P}\big[\mathbb{P} [X_{t}^{1, *} > 0 \, | \, W^{0}, \mathcal{G} ] > x\big]\big|\,dt $$

for \(x \in [0, 1]\) and determine its order of convergence.

Corollary 3.3

For any \(x \in [0, 1]\) such that \(\mathbb{P}[X_{t}^{1, *} > 0 \, | \, W^{0}, C_{1}', \mathcal{G}]\) has a bounded density near \(x\), uniformly in \(t \in [0, T]\), we have \(E(x, T) = \mathcal{O}(\epsilon ^{1/3})\) as \(\epsilon \to 0{+}\).

4 Proofs: large vol-of-vol setting

We prove Theorems 2.4 and 2.6, Proposition 2.7 and Corollary 2.8, the main results of Sect. 2.

Proof of Theorem 2.4

To establish convergence in distribution, we show that for every bounded and continuous function \(G: \mathbb{R} \rightarrow \mathbb{R}\), we have

$$ \mathbb{E}\big[G (\mathbb{P} [X_{t}^{1, \epsilon } \in \mathcal{I} \, | \, W^{0}, B^{0}, \mathcal{G} ] ) \big] \longrightarrow \mathbb{E} \big[G (\mathbb{P} [X_{t}^{1, *} \in \mathcal{I} \, | \, W^{0}, \mathcal{G} ] ) \big] $$
(4.1)

as \(\epsilon \rightarrow 0{+}\), where \(\mathcal{I} = (0, U]\). Observe now that since the conditional probabilities take values in the compact interval \([0, 1]\), it is equivalent to have (4.1) for all continuous \(G: [0, 1] \rightarrow \mathbb{R}\), and by the Weierstrass approximation theorem and linearity, it suffices to verify (4.1) for the monomials \(G(x)=x^{m}\), \(m \in \mathbb{N}\).
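The reduction to monomials is the Weierstrass theorem in action: the moments of a \([0, 1]\)-valued random variable determine \(\mathbb{E}[G(\cdot )]\) for every continuous \(G\). As a quick illustration, the Bernstein construction below approximates a continuous test function (chosen arbitrarily for illustration) uniformly by a linear combination of monomials:

```python
import numpy as np
from math import comb

def bernstein(G, n, xs):
    # Bernstein polynomial B_n[G](x) = sum_k G(k/n) C(n, k) x^k (1-x)^(n-k);
    # it is a linear combination of monomials and converges uniformly to G
    # on [0, 1] (Weierstrass approximation theorem).
    ks = np.arange(n + 1)
    coeffs = np.array([G(k / n) * comb(n, k) for k in ks], dtype=float)
    return np.array([np.sum(coeffs * x ** ks * (1.0 - x) ** (n - ks)) for x in xs])

G = lambda x: np.cos(3.0 * x)          # an arbitrary continuous test function
xs = np.linspace(0.0, 1.0, 201)
err = float(np.max(np.abs(bernstein(G, 400, xs) - G(xs))))
```

The uniform error decays like \(1/n\) for smooth \(G\), so moment convergence of the conditional probabilities propagates to all continuous test functions.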

We now write \(Y^{i, \epsilon }\) for the \(i\)th asset’s distance to default in the sped-up volatility setting when the stopping condition at zero is ignored, that is,

$$\begin{aligned} Y_{t}^{i, \epsilon } =& x^{i} + \int _{0}^{t}\bigg(r_{i}- \frac{h^{2} (\sigma _{\frac{s}{\epsilon }}^{i,1} )}{2}\bigg)\,ds \\ & +\int _{0}^{t}h(\sigma _{\frac{s}{\epsilon }}^{i,1})\rho _{1,i}\,dW_{s}^{0} +\int _{0}^{t}h(\sigma _{\frac{s}{\epsilon }}^{i,1})\sqrt{1-\rho _{1,i}^{2}} \,dW_{s}^{i} \end{aligned}$$

with

$$\begin{aligned} \sigma _{t}^{i,1} =& \sigma ^{i, \mathrm{init}} + \kappa \int _{0}^{t} ( \theta - \sigma _{s}^{i,1} )\,ds \\ & + v\int _{0}^{t}g (\sigma _{s}^{i,1} )\rho _{2}\,dB_{s}^{0} + v \int _{0}^{t}g (\sigma _{s}^{i,1} )\sqrt{1 - \rho _{2}^{2}}\,dB_{s}^{i} \end{aligned}$$

for all \(t \geq 0\), and then we have \(X_{t}^{i, \epsilon } = Y_{t \wedge T_{i}^{\epsilon }}^{i, \epsilon }\). The \(m\) stochastic processes \(X^{i, \epsilon }\), \(1 \leq i \leq m\), are clearly i.i.d. given the information contained in \(W^{0}, B^{0}\) and \(\mathcal{G}\). Therefore we can write, with \(G(x)=x^{m}\),

$$\begin{aligned} & \mathbb{E}\big[G (\mathbb{P} [X_{t}^{1, \epsilon } \in \mathcal{I} \, | \, W^{0}, B^{0}, \mathcal{G} ] )\big] \\ & = \mathbb{E}\big[\mathbb{P}^{m} [X_{t}^{1, \epsilon } \in \mathcal{I} \, | \, W^{0}, B^{0}, \mathcal{G} ] \big] \\ & = \mathbb{E}\big[\mathbb{P} [X_{t}^{1, \epsilon } \in \mathcal{I}, X_{t}^{2, \epsilon } \in \mathcal{I}, \dots , X_{t}^{m, \epsilon } \in \mathcal{I} \, | \, W^{0}, B^{0}, \mathcal{G} ] \big] \\ & = \mathbb{P} [X_{t}^{1, \epsilon } \in \mathcal{I}, X_{t}^{2, \epsilon } \in \mathcal{I}, \dots , X_{t}^{m, \epsilon } \in \mathcal{I} ] \\ & = \mathbb{P}\Big[\Big(\min _{1 \leq i \leq m}\min _{0 \leq s \leq t}Y_{s}^{i, \epsilon } , \max _{1 \leq i \leq m}Y_{t}^{i, \epsilon } \Big) \in (0, +\infty )\times (-\infty , U ] \Big]. \end{aligned}$$
(4.2)
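The factorisation in (4.2), namely that the \(m\)th moment of a conditional probability is the joint probability of \(m\) conditionally i.i.d. copies, can be checked by simulation in a toy mixture model; the latent factor and distributions below are illustrative stand-ins for the information in \(W^{0}\), \(B^{0}\) and \(\mathcal{G}\):

```python
import numpy as np

rng = np.random.default_rng(7)
m, n = 3, 200_000
# Toy latent factor Z (common "market" information) and, given Z,
# m conditionally i.i.d. event indicators with probability p(Z) each.
Z = rng.uniform(size=n)
p = 0.2 + 0.6 * Z                      # conditional probability p(Z)
events = rng.uniform(size=(n, m)) < p[:, None]

lhs = float(np.mean(p ** m))           # E[ P[A | Z]^m ]
rhs = float(np.mean(events.all(axis=1)))  # P[A_1 and ... and A_m]
# The two Monte Carlo estimates agree up to sampling error.
```

Here the exact common value is \(\mathbb{E}[p(Z)^{3}] = 0.17\), and both estimators recover it.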

Next, for each \(i\), we write \(Y^{i, *}\) for the process \(X^{i, *}\) when the stopping condition at zero is ignored, that is,

$$\begin{aligned} Y_{t}^{i, *} = X_{0}^{i} + \bigg(r_{i} - \frac{\sigma _{2,1}^{2}}{2} \bigg)t + \tilde{\rho }_{1,i}\sigma _{2,1}W_{t}^{0} + \sqrt{1 - { \tilde{\rho }_{1,i}}^{2}}\sigma _{2,1}W_{t}^{i} \end{aligned}$$

for all \(t \geq 0\), with \(\tilde{\rho }_{1,i} = \rho _{1,i} \frac{\tilde{\sigma }}{\sigma _{2,1}}\). Again, it is easy to check that the processes \(Y^{i,*}\) are i.i.d. given the information contained in \(W^{0}, B^{0}\) and \(\mathcal{G}\). Thus we can write

$$\begin{aligned} & \mathbb{E}\big[G (\mathbb{P} [X_{t}^{1, *} \in \mathcal{I} \, | \, W^{0}, \mathcal{G} ] )\big] \\ & = \mathbb{E}\big[\mathbb{P}^{m} [X_{t}^{1, *} \in \mathcal{I} \, | \, W^{0}, \mathcal{G} ] \big] \\ & = \mathbb{E}\big[\mathbb{P} [X_{t}^{1, *} \in \mathcal{I}, X_{t}^{2, *} \in \mathcal{I}, \dots , X_{t}^{m, *} \in \mathcal{I} \, | \, W^{0}, \mathcal{G} ] \big] \\ & = \mathbb{P} [X_{t}^{1, *} \in \mathcal{I}, X_{t}^{2, *} \in \mathcal{I}, \dots , X_{t}^{m, *} \in \mathcal{I} ] \\ & = \mathbb{P}\Big[\Big(\min _{1 \leq i \leq m}\min _{0 \leq s \leq t}Y_{s}^{i, *} , \max _{1 \leq i \leq m}Y_{t}^{i, *} \Big) \in (0, +\infty ) \times (-\infty , U ] \Big]. \end{aligned}$$
(4.3)

Then (4.2) and (4.3) show that the result we want to prove has been reduced to the convergence

$$ \Big(\min _{1 \leq i \leq m}\min _{0 \leq s \leq t}Y_{s}^{i, \epsilon } , \max _{1 \leq i \leq m}Y_{t}^{i, \epsilon } \Big) \longrightarrow \Big(\min _{1 \leq i \leq m}\min _{0 \leq s \leq t}Y_{s}^{i, *} , \max _{1 \leq i \leq m}Y_{t}^{i, *} \Big) $$

in distribution as \(\epsilon \rightarrow 0{+}\) (since the probability that any of the \(m\) minima equals zero is zero, as the minimum of any Gaussian process is always continuously distributed, while \(Y^{i, \epsilon }\) is obviously Gaussian for any given path of \(\sigma ^{i,1}\)).

Let \(C([0, t]; \mathbb{R}^{m})\) be the classical Wiener space of continuous functions defined on \([0, t]\) and taking values in \(\mathbb{R}^{m}\) (i.e., the space of these functions equipped with the supremum norm and the Wiener measure), and observe that \(\min _{1 \leq i \leq m}p_{i}(\min _{0 \leq s \leq t}\cdot (s))\) defined on \(C([0, t]; \mathbb{R}^{m})\), where \(p_{i}\) stands for the projection on the \(i\)th axis, is a continuous functional. Indeed, for any \(f_{1}, f_{2}\) in \(C([0, t]; \mathbb{R}^{m})\), we have

$$ \Big|\min _{1 \leq i \leq m}p_{i}\Big(\min _{0 \leq s \leq t}f_{1}(s) \Big) - \min _{1 \leq i \leq m}p_{i}\Big(\min _{0 \leq s \leq t}f_{2}(s) \Big)\Big| = \big|p_{i_{1}}\big(f_{1}(s_{1})\big) - p_{i_{2}}\big(f_{2}(s_{2}) \big)\big| $$

for some \(s_{1}, s_{2} \in [0, t]\) and \(1 \leq i_{1}, i_{2} \leq m\), and without loss of generality, we may assume that the difference inside the last absolute value is nonnegative. Moreover, we have

$$\begin{aligned} p_{i_{1}}\big(f_{1}(s_{1})\big) = \min _{1 \leq i \leq m}p_{i}\Big( \min _{0 \leq s \leq t}f_{1}(s)\Big) \leq p_{i_{2}}\big(f_{1}(s_{2}) \big) \end{aligned}$$

and thus

$$\begin{aligned} \Big|\min _{1 \leq i \leq m}p_{i}\Big(\min _{0 \leq s \leq t}f_{1}(s) \Big) - \min _{1 \leq i \leq m}p_{i}\Big(\min _{0 \leq s \leq t}f_{2}(s) \Big)\Big| = &p_{i_{1}}\big(f_{1}(s_{1})\big) - p_{i_{2}}\big(f_{2}(s_{2}) \big) \\ \leq & p_{i_{2}}\big(f_{1}(s_{2})\big) - p_{i_{2}}\big(f_{2}(s_{2}) \big) \\ \leq & \big|p_{i_{2}}\big(f_{1}(s_{2})\big) - p_{i_{2}}\big(f_{2}(s_{2}) \big)\big| \\ \leq & \Vert f_{1} - f_{2} \Vert _{C ( [0, t ]; \mathbb{R}^{m} )}. \end{aligned}$$

Clearly, \(\max _{1 \leq i \leq m}p_{i}(\cdot (t))\) defined on \(C([0, t]; \mathbb{R}^{m})\) is also continuous (as the maximum of finitely many evaluation functionals). Thus, our problem is finally reduced to showing that \((Y^{1, \epsilon }, Y^{2, \epsilon }, \dots , Y^{m, \epsilon })\) converges in distribution to \((Y^{1, *}, Y^{2, *}, \dots , Y^{m, *})\) in the space \(C([0, t]; \mathbb{R}^{m})\), as \(\epsilon \rightarrow 0{+}\).
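The computation above shows that the min–min functional is 1-Lipschitz with respect to the supremum norm. A toy numerical check on discretised random paths (generated purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Discretised paths f1, f2 in C([0, t]; R^m): rows are time points,
# columns are the m = 3 coordinates.
f1 = rng.standard_normal((500, 3)).cumsum(axis=0)
f2 = f1 + 0.1 * rng.standard_normal((500, 3))

lhs = abs(f1.min() - f2.min())   # |min_i min_s p_i(f1) - min_i min_s p_i(f2)|
rhs = float(np.abs(f1 - f2).max())  # ||f1 - f2|| in the supremum norm
# 1-Lipschitz property: lhs <= rhs for every pair of paths.
```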

In order to show the convergence in distribution, we first establish the existence of limits in distribution along subsequences as \(\epsilon \rightarrow 0{+}\) by using a tightness argument, and then we characterise the limits of the finite-dimensional distributions. To show tightness of the laws of \((Y^{1, \epsilon }, Y^{2, \epsilon }, \dots , Y^{m, \epsilon })\) for \(\epsilon > 0\), we recall a special case of Ethier and Kurtz [9, Theorem 3.7.2] for continuous processes, according to which it suffices to prove that for a given \(\eta > 0\), there exist some \(\delta > 0\) and \(N > 0\) such that

$$ \mathbb{P} [ | (Y_{0}^{1, \epsilon }, Y_{0}^{2, \epsilon }, \dots , Y_{0}^{m, \epsilon } ) |_{\mathbb{R}^{m}} > N ] \leq \eta $$
(4.4)

and

$$\begin{aligned} \mathbb{P}\Big[\sup _{0\leq s_{1}, s_{2} \leq t, |s_{1} - s_{2}| \leq \delta } &| (Y_{s_{1}}^{1, \epsilon }, Y_{s_{1}}^{2, \epsilon }, \dots , Y_{s_{1}}^{m, \epsilon } ) \\ & - (Y_{s_{2}}^{1, \epsilon }, Y_{s_{2}}^{2, \epsilon }, \dots , Y_{s_{2}}^{m, \epsilon } ) |_{\mathbb{R}^{m}} > \eta \Big] \leq \eta \end{aligned}$$
(4.5)

for all \(\epsilon > 0\). Condition (4.4) can easily be achieved for some large enough \(N > 0\), since \((Y_{0}^{1, \epsilon }, Y_{0}^{2, \epsilon }, \dots , Y_{0}^{m, \epsilon }) = (x^{1}, x^{2}, \dots , x^{m})\), which is independent of \(\epsilon \) and almost surely finite (the sum over \(n \in \mathbb{N}\) of the probabilities that the norm of this vector belongs to \([n, n+1]\) is a convergent series, so its tail over \(n \geq N\) tends to zero as \(N \rightarrow \infty \)). For (4.5), observe that \(| \cdot |_{\mathbb{R}^{m}}\) can be any of the standard equivalent \(L^{p}\)-norms on \(\mathbb{R}^{m}\), and we choose the \(L^{\infty }\)-norm. Then we have

$$\begin{aligned} & \mathbb{P}\Big[\sup _{0\leq s_{1}, s_{2} \leq t, |s_{1} - s_{2}| \leq \delta } | (Y_{s_{1}}^{1, \epsilon }, Y_{s_{1}}^{2, \epsilon }, \dots , Y_{s_{1}}^{m, \epsilon } ) - (Y_{s_{2}}^{1, \epsilon }, Y_{s_{2}}^{2, \epsilon }, \dots , Y_{s_{2}}^{m, \epsilon } ) |_{\mathbb{R}^{m}} > \eta \Big] \\ & = \mathbb{P}\bigg[\bigcup _{i=1}^{m}\Big\{ \sup _{0\leq s_{1}, s_{2} \leq t, |s_{1} - s_{2}| \leq \delta } |Y_{s_{1}}^{i, \epsilon } - Y_{s_{2}}^{i, \epsilon } | > \eta \Big\} \bigg] \\ & \leq \sum _{i=1}^{m}\mathbb{P}\Big[\sup _{0\leq s_{1}, s_{2} \leq t, |s_{1} - s_{2}| \leq \delta } |Y_{s_{1}}^{i, \epsilon } - Y_{s_{2}}^{i, \epsilon } | > \eta \Big] \\ & = m\mathbb{P}\Big[\sup _{0\leq s_{1}, s_{2} \leq t, |s_{1} - s_{2}| \leq \delta } |Y_{s_{1}}^{1, \epsilon } - Y_{s_{2}}^{1, \epsilon } | > \eta \Big] , \end{aligned}$$
(4.6)

and since the Itô integral \(\int _{0}^{t}h(\sigma _{\frac{s}{\epsilon }}^{1, 1})(\sqrt{1-\rho _{1,1}^{2}}\,dW_{s}^{1}+\rho _{1,1}\,dW_{s}^{0})\) can be written as \(\tilde{W}_{\int _{0}^{t}h^{2}(\sigma _{\frac{s}{\epsilon }}^{1,1})\,ds}\), where \(\tilde{W}\) is another standard Brownian motion, denoting the maximum of \(h\) by \(M\) also gives

$$\begin{aligned} & \mathbb{P}\Big[\sup _{0\leq s_{1}, s_{2} \leq t, |s_{1} - s_{2}| \leq \delta } |Y_{s_{1}}^{1, \epsilon } - Y_{s_{2}}^{1, \epsilon } | > \eta \Big] \\ & = \mathbb{P}\bigg[\sup _{0\leq s_{1}, s_{2} \leq t, |s_{1} - s_{2}| \leq \delta }\bigg|\int _{s_{2}}^{s_{1}}\bigg(r- \frac{h^{2} (\sigma _{\frac{s}{\epsilon }}^{1,1} )}{2}\bigg)\,ds \\ & \qquad \qquad \qquad \qquad \qquad +\bigg(\tilde{W}_{ \int _{0}^{s_{1}}h^{2} (\sigma _{\frac{s}{\epsilon }}^{1,1} )\, ds} - \tilde{W}_{\int _{0}^{s_{2}}h^{2} (\sigma _{\frac{s}{\epsilon }}^{1,1} )\,ds}\bigg)\bigg| > \eta \bigg] \\ & \leq \mathbb{P}\bigg[\sup _{0\leq s_{1}, s_{2} \leq t, |s_{1} - s_{2}| \leq \delta }\bigg|\int _{s_{2}}^{s_{1}}\bigg(r- \frac{h^{2} (\sigma _{\frac{s}{\epsilon }}^{1,1} )}{2}\bigg)\,ds \bigg| > \frac{\eta }{2} \bigg] \\ & \phantom{=:}+ \mathbb{P}\bigg[\sup _{0\leq s_{1}, s_{2} \leq t, |s_{1} - s_{2}| \leq \delta }\big|\tilde{W}_{\int _{0}^{s_{1}}h^{2} (\sigma _{ \frac{s}{\epsilon }}^{1,1} )\,ds} - \tilde{W}_{\int _{0}^{s_{2}}h^{2} ( \sigma _{\frac{s}{\epsilon }}^{1,1} )\,ds}\big| > \frac{\eta }{2} \bigg] \\ & \leq \mathbb{P}\bigg[\delta (r + M ) > \frac{\eta }{2} \bigg] \\ & \phantom{=:} + \mathbb{P}\bigg[\sup _{0\leq s_{3}, s_{4} \leq M^{2}t, |s_{3} - s_{4}| \leq M^{2}\delta } |\tilde{W}_{s_{3}} - \tilde{W}_{s_{4}} | > \frac{\eta }{2} \bigg] , \end{aligned}$$

since \(|\int _{a}^{b}h^{2}(\sigma _{\frac{s}{\epsilon }}^{1,1})\,ds| \leq M^{2}|a - b|\) for all \(a, b \in \mathbb{R}_{+}\). The first of the last two probabilities is clearly zero for \(\delta < \frac{\eta }{2(r + M)}\), while the second one can also be made arbitrarily small for small enough \(\delta \) since by a well-known result about the modulus of continuity of Brownian motion (see Mörters and Peres [22, Theorem 1.14]), the supremum within that probability converges almost surely (and so also in probability) to 0 as fast as \(M\sqrt{2\delta \ln \frac{1}{M^{2}\delta }}\). Using these in (4.6), we deduce that (4.5) is also satisfied and we have the desired tightness result, which implies that \((Y_{\cdot }^{1, \epsilon }, \dots , Y_{\cdot }^{m, \epsilon })\) converges in distribution to some limit \((Y_{\cdot }^{1, 0}, \dots , Y_{\cdot }^{m, 0})\) along some subsequence.
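The tightness estimate rests on the uniform modulus of continuity of Brownian motion. The following sketch approximates \(\sup _{|s_{1} - s_{2}| \leq \delta }|\tilde{W}_{s_{1}} - \tilde{W}_{s_{2}}|\) on a discrete grid and compares it with the Lévy scale \(\sqrt{2\delta \ln (1/\delta )}\); the discretisation and parameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 20000, 1.0
dt = T / n
# Discretised standard Brownian motion on [0, 1].
W = np.concatenate([[0.0], np.sqrt(dt) * rng.standard_normal(n).cumsum()])

def modulus(W, delta, dt):
    # sup_{|s1 - s2| <= delta} |W_{s1} - W_{s2}|, approximated on the grid by
    # the largest range of W over any window of length delta.
    w = int(round(delta / dt))
    return max(W[i:i + w + 1].max() - W[i:i + w + 1].min()
               for i in range(len(W) - w))

for delta in (0.1, 0.01, 0.001):
    print(delta, modulus(W, delta, dt), np.sqrt(2 * delta * np.log(1 / delta)))
```

The empirical modulus shrinks with \(\delta \) at the rate the proof uses, which is what forces the second probability in the display above to vanish as \(\delta \to 0{+}\).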

To conclude our proof, we need to show that \((Y^{1, 0}, \dots , Y^{m, 0})\) coincides with \((Y^{1, *}, \dots , Y^{m, *})\). But both \(m\)-dimensional processes are uniquely determined by their finite-dimensional distributions, and evaluation functionals on \(C([0, t]; \mathbb{R}^{m})\) preserve convergence in distribution (as continuous linear functionals). So we only need to show that for any fixed \((i_{1}, \dots , i_{\ell }) \in \{1, \dots , m\}^{\ell }\), any fixed \((t_{1}, \dots , t_{\ell }) \in (0, +\infty )^{\ell }\) and any fixed continuous bounded \(q: \mathbb{R}^{\ell } \rightarrow \mathbb{R}\), for an arbitrary \(\ell \in \mathbb{N}\), we have

$$\begin{aligned} & \mathbb{E} [q (Y_{t_{1}}^{i_{1}, \epsilon }, Y_{t_{2}}^{i_{2}, \epsilon }, \dots , Y_{t_{\ell }}^{i_{\ell }, \epsilon } ) ] \longrightarrow \mathbb{E} [q (Y_{t_{1}}^{i_{1}, *}, Y_{t_{2}}^{i_{2}, *}, \dots , Y_{t_{\ell }}^{i_{\ell }, *} ) ] \end{aligned}$$

as \(\epsilon \rightarrow 0{+}\). By dominated convergence, this follows if we can show that

$$\begin{aligned} & \lim _{\epsilon \rightarrow 0{+}}\mathbb{E} [q(Y_{t_{1}}^{i_{1}, \epsilon }, Y_{t_{2}}^{i_{2}, \epsilon }, \dots , Y_{t_{\ell }}^{i_{ \ell }, \epsilon }) \, | \,\sigma _{\cdot }^{i_{1}, 1}, \sigma _{ \cdot }^{i_{2}, 1}, \dots , \sigma _{\cdot }^{i_{\ell }, 1}, \mathcal{C} ] \\ & = \mathbb{E} [q (Y_{t_{1}}^{i_{1}, *}, Y_{t_{2}}^{i_{2}, *}, \dots , Y_{t_{\ell }}^{i_{\ell }, *} ) \, | \, \sigma _{\cdot }^{i_{1}, 1}, \sigma _{\cdot }^{i_{2}, 1}, \dots , \sigma _{\cdot }^{i_{\ell }, 1}, \mathcal{C} ] \end{aligned}$$

ℙ-almost surely. However, when the information contained in \(\sigma _{\cdot }^{i_{1}, 1}, \dots , \sigma _{\cdot }^{i_{\ell }, 1}\) and \(\mathcal{C}\) is given, both \((Y_{t_{1}}^{i_{1}, \epsilon }, \dots , Y_{t_{\ell }}^{i_{\ell }, \epsilon })\) and \((Y_{t_{1}}^{i_{1}, *}, \dots , Y_{t_{\ell }}^{i_{\ell }, *})\) follow a normal distribution on \(\mathbb{R}^{\ell }\). This means that given \((\sigma _{\cdot }^{i_{1}, 1}, \dots , \sigma _{\cdot }^{i_{\ell }, 1})\) and \(\mathcal{C}\), we only need to show that as \(\epsilon \rightarrow 0{+}\), the mean vector and the covariance matrix of \((Y_{t_{1}}^{i_{1}, \epsilon }, \dots , Y_{t_{\ell }}^{i_{\ell }, \epsilon })\) converge to the mean vector and the covariance matrix of \((Y_{t_{1}}^{i_{1}, *}, \dots , Y_{t_{\ell }}^{i_{\ell }, *})\), respectively. Given \((\sigma _{\cdot }^{i_{1}, 1}, \dots , \sigma _{\cdot }^{i_{\ell }, 1})\), \(\mathcal{C}\) and a \(k \in \{1, 2, \dots , \ell \}\), the \(k\)th coordinate of the mean vector of \((Y_{t_{1}}^{i_{1}, \epsilon }, Y_{t_{2}}^{i_{2}, \epsilon }, \dots , Y_{t_{ \ell }}^{i_{\ell }, \epsilon })\) is \(X_{0}^{i_{k}} + \int _{0}^{t_{k}}(r_{i_{k}}- \frac{h^{2}(\sigma _{\frac{s}{\epsilon }}^{i_{k}, 1})}{2})\,ds\), and by the positive recurrence property, this converges to \(X_{0}^{i_{k}} + (r_{i_{k}} - \frac{\sigma _{2,1}^{2}}{2})t_{k}\) as \(\epsilon \rightarrow 0{+}\) (since the volatility processes all have the same coefficients and thus the same stationary distributions), which is the \(k\)th coordinate of the mean vector of \((Y_{t_{1}}^{i_{1}, *}, Y_{t_{2}}^{i_{2}, *}, \dots , Y_{t_{\ell }}^{i_{ \ell }, *})\). Now we only need to obtain the corresponding convergence result for the covariance matrices. For some \(1 \leq p, q \leq \ell \), given \((\sigma _{\cdot }^{i_{1}, 1}, \sigma _{\cdot }^{i_{2}, 1}, \dots , \sigma _{\cdot }^{i_{\ell }, 1})\) and \(\mathcal{C}\), the covariance of \(Y_{t_{p}}^{i_{p}, \epsilon }\) and \(Y_{t_{q}}^{i_{q}, \epsilon }\) is equal to

$$\begin{aligned} & \left (\rho _{1, i_{p}}\rho _{1, i_{q}} + \delta _{i_{p}, i_{q}} \sqrt{1 - \rho _{1, i_{p}}^{2}}\sqrt{1 - \rho _{1, i_{q}}^{2}}\right )\int _{0}^{t_{p} \wedge t_{q}}h (\sigma _{\frac{s}{\epsilon }}^{i_{p}, 1} )h (\sigma _{ \frac{s}{\epsilon }}^{i_{q}, 1} )\,ds, \end{aligned}$$

while the covariance of \(Y_{t_{p}}^{i_{p}, *}\) and \(Y_{t_{q}}^{i_{q}, *}\) is equal to

$$\begin{aligned} & \left (\tilde{\rho }_{1,i_{p}}\tilde{\rho }_{1,i_{q}} + \delta _{i_{p}, i_{q}}\sqrt{1 - \tilde{\rho }_{1, i_{p}}^{2}}\sqrt{1 - \tilde{\rho }_{1, i_{q}}^{2}}\right ) \sigma _{2, 1}^{2} (t_{p} \wedge t_{q}). \end{aligned}$$

This means that for \(i_{p} = i_{q} = i \in \{1, 2, \dots , m\}\), we need to show that

$$\begin{aligned} & \int _{0}^{t_{p} \wedge t_{q}}h^{2} (\sigma _{\frac{s}{\epsilon }}^{i, 1} )\,ds \longrightarrow \sigma _{2, 1}^{2} (t_{p} \wedge t_{q}) \end{aligned}$$

as \(\epsilon \rightarrow 0{+}\), while for \(i_{p} \neq i_{q}\), we need to show that

$$\begin{aligned} & \rho _{1, i_{p}}\rho _{1, i_{q}} \int _{0}^{t_{p} \wedge t_{q}}h ( \sigma _{\frac{s}{\epsilon }}^{i_{p}, 1} )h (\sigma _{ \frac{s}{\epsilon }}^{i_{q}, 1} )\,ds \longrightarrow \tilde{\rho }_{1, i_{p}}\tilde{\rho }_{1, i_{q}} \sigma _{2, 1}^{2} (t_{p} \wedge t_{q}) \end{aligned}$$

as \(\epsilon \rightarrow 0{+}\), where \(\tilde{\rho }_{1, i}\sigma _{2, 1} = \rho _{1, i}\tilde{\sigma }\) for all \(i \leq m\). Both results follow from the positive recurrence property for \(\tilde{\sigma } = \sqrt{\mathbb{E} [h (\sigma ^{i_{p}, i_{q}, 1, *} )h (\sigma ^{i_{p}, i_{q}, 2, *} ) ]}\), which does not depend on \(i_{p}\) and \(i_{q}\) since the volatility processes all have the same coefficients and thus the same joint stationary distributions. This concludes the proof. □
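The averaging mechanism used throughout this proof, namely that time integrals of the fast variable converge to the corresponding stationary averages, can be isolated in a deterministic caricature: replace the positively recurrent diffusion by a periodic path, for which \(\int _{0}^{t}h^{2}(\sigma _{s/\epsilon })\,ds \rightarrow t\,\overline{h^{2}}\) with \(\overline{h^{2}}\) the average over one period. The choices of path and \(h\) below are illustrative:

```python
import numpy as np

def averaged_integral(eps, t=1.0, n=200_000):
    # int_0^t h^2(sigma_{s/eps}) ds with the toy periodic path sigma_u = u
    # and h = cos, so h^2(sigma_{s/eps}) = cos^2(s/eps), whose average over
    # one period is 1/2.  Trapezoid rule on a fine grid.
    s = np.linspace(0.0, t, n + 1)
    f = np.cos(s / eps) ** 2
    ds = s[1] - s[0]
    return float(np.sum(f[1:] + f[:-1]) * ds / 2.0)

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, abs(averaged_integral(eps) - 0.5))  # error shrinks like eps
```

The error is of order \(\epsilon \), mirroring how the positive recurrence property replaces \(\int _{0}^{t}h^{2}(\sigma _{s/\epsilon }^{i,1})\,ds\) by \(\sigma _{2,1}^{2}\,t\) in the covariance computation.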

Proof of Theorem 2.6

Let \(\mathbb{V}\) be the set of \({\mathbb{F}}^{W^{0}}\)-adapted, square-integrable semimartingales on \([0, T]\). Thus for any \((V_{t})_{0 \leq t \leq T} \in \mathbb{V}\), there exist two \({\mathbb{F}}^{W^{0}}\)-adapted and square-integrable processes \((v_{1,t})_{0 \leq t \leq T}\) and \((v_{2,t})_{0 \leq t \leq T}\) such that

$$\begin{aligned} V_{t} = V_{0} + \int _{0}^{t}v_{1,s}\,ds + \int _{0}^{t}v_{2,s}\,dW_{s}^{0} \end{aligned}$$
(4.7)

for \(0 \leq t \leq T\). The processes of the above form for which \((v_{1,t})_{0 \leq t \leq T}\) and \((v_{2,t})_{0 \leq t \leq T}\) are simple processes, that is,

$$\begin{aligned} v_{i,t} = F_{i}\mathbb{I}_{ [t_{1}, t_{2} ]}(t) \end{aligned}$$
(4.8)

for all \(0 \leq t \leq T\) and \(i \in \{1, 2\}\), with each \(F_{i}\) being \(\mathcal{F}_{t_{1}}^{W^{0}}\)-measurable, span a linear subspace \(\tilde{\mathbb{V}}\) which is dense in \(\mathbb{V}\) for the \(L^{2}\)-norm. By using the boundedness of \(h\) and then the estimate (2.3), for any \(p > 0\) and any \(T > 0\), we obtain

$$ \int _{0}^{T}\big\Vert h^{p} (\sigma _{\frac{t}{\epsilon }}^{1,1} )u^{ \epsilon }(t, \cdot )\big\Vert _{L_{\sigma , \mathcal{C}}^{2}( \mathbb{R}_{+}\times \Omega )}^{2}\,dt \leq TC^{2p}\left \Vert u_{0} \right \Vert _{L^{2}(\mathbb{R}_{+})}^{2}. $$
(4.9)

It follows that any sequence \(\epsilon _{n} \rightarrow 0{+}\) always has a subsequence \((\epsilon _{k_{n}})_{n \in \mathbb{N}}\) such that \(h^{p}(\sigma _{\frac{\cdot }{\epsilon _{k_{n}}}}^{1,1})u^{\epsilon _{k_{n}}}(\cdot , \cdot )\) converges weakly to some \(u_{p}(\cdot , \cdot )\) in \(L_{\sigma , \mathcal{C}}^{2}([0, T] \times \mathbb{R}_{+} \times \Omega )\) for \(p \in \{1, 2 \}\). Testing (2.2) against an arbitrary smooth and compactly supported function \(f : \mathbb{R}_{+} \to \mathbb{R}\), using Itô’s formula for the product of \(\int _{\mathbb{R}_{+}}u^{\epsilon }(\cdot , x)f(x)\,dx\) with a process \(V \in \tilde{\mathbb{V}}\) of the form (4.7), (4.8) and finally taking expectations, we find

$$\begin{aligned} & \mathbb{E}_{\sigma , \mathcal{C}}\left [V_{t}\int _{\mathbb{R}_{+}}u^{ \epsilon }(t, x)f(x)\,dx\right ] \\ & = \mathbb{E}_{\sigma , \mathcal{C}}\left [V_{0}\int _{\mathbb{R}_{+}}u_{0}(x)f(x) \,dx\right ] + r\int _{0}^{t}\mathbb{E}_{\sigma , \mathcal{C}}\left [V_{s} \int _{\mathbb{R}_{+}}u^{\epsilon }(s, x)f'(x)\,dx\right ]\,ds \\ & \phantom{=:} - \int _{0}^{t}\mathbb{E}_{\sigma , \mathcal{C}} \bigg[V_{s}\int _{ \mathbb{R}_{+}}\frac{h^{2} (\sigma _{\frac{s}{\epsilon }}^{1,1} )}{2}u^{ \epsilon }(s, x)f'(x)\,dx\bigg]\,ds \\ & \phantom{=:} +\int _{0}^{t}\mathbb{E}_{\sigma , \mathcal{C}}\bigg[V_{s}\int _{ \mathbb{R}_{+}}\frac{h^{2} (\sigma _{\frac{s}{\epsilon }}^{1,1} )}{2} u^{ \epsilon }(s, x)f''(x)\,dx\bigg]\,ds \\ &\phantom{=:}+\int _{0}^{t}\mathbb{E}_{\sigma , \mathcal{C}}\bigg[v_{1,s}\int _{ \mathbb{R}_{+}} u^{\epsilon }(s, x)f(x)\,dx\bigg]\,ds \\ & \phantom{=:} + \rho _{1,1}\int _{0}^{t}\mathbb{E}_{\sigma , \mathcal{C}} \bigg[v_{2,s} \int _{\mathbb{R}_{+}}h (\sigma _{\frac{s}{\epsilon }}^{1,1} )u^{ \epsilon }(s, x)f'(x)\,dx \bigg]\,ds \end{aligned}$$
(4.10)

for all \(t \leq T\). Upon setting \(\epsilon = \epsilon _{k_{n}}\) and taking \(n \rightarrow \infty \), the weak convergence results above yield

$$\begin{aligned} & \mathbb{E}_{\sigma , \mathcal{C}}\left [V_{t}\int _{\mathbb{R}_{+}}u^{*}(t, x)f(x)\,dx\right ] \\ & = \mathbb{E}_{\sigma , \mathcal{C}}\left [V_{0}\int _{\mathbb{R}_{+}}u_{0}(x)f(x) \,dx\right ] + r\int _{0}^{t}\mathbb{E}_{\sigma , \mathcal{C}}\left [V_{s} \int _{\mathbb{R}_{+}}u^{*}(s, x)f'(x)\,dx\right ]\,ds \\ & \phantom{=:} - \frac{1}{2}\int _{0}^{t}\mathbb{E}_{\sigma , \mathcal{C}}\left [V_{s} \int _{\mathbb{R}_{+}}u_{2}(s, x)f'(x)\,dx\right ]\,ds \\ & \phantom{=:} + \frac{1}{2}\int _{0}^{t}\mathbb{E}_{\sigma , \mathcal{C}}\left [V_{s} \int _{\mathbb{R}_{+}} u_{2}(s, x)f''(x)\,dx\right ]\,ds \\ &\phantom{=:} +\int _{0}^{t}\mathbb{E}_{\sigma , \mathcal{C}}\left [v_{1,s}\int _{ \mathbb{R}_{+}} u^{*}(s, x)f(x)\,dx\right ]\,ds \\ & \phantom{=:}+ \rho _{1,1}\int _{0}^{t}\mathbb{E}_{\sigma , \mathcal{C}}\left [v_{2,s} \int _{\mathbb{R}_{+}}u_{1}(s, x)f'(x)\,dx\right ]\,ds \end{aligned}$$
(4.11)

for all \(0 \leq t \leq T\). The convergence of the terms on the right-hand side of (4.10) holds pointwise in \(t\), while the term on the left-hand side converges weakly. Since we can easily find uniform bounds for all terms in (4.10) (by using (4.9)), dominated convergence implies that all the weak limits coincide with the corresponding pointwise limits, which gives (4.11) as a limit of (4.10) both weakly and pointwise in \(t\). It is clear then that \(\mathbb{E}_{\sigma , \mathcal{C}}[V_{t}\int _{\mathbb{R}_{+}}u^{*}(t, x)f(x)\,dx]\) is differentiable in \(t\) (in a \(W^{1,1}\)-sense). Next, we can check that the expectation \(\mathbb{E}_{\sigma , \mathcal{C}}[v_{i,t}\int _{\mathbb{R}_{+}}u^{ \epsilon _{k_{n}}}(t, x)f(x)\,dx]\) converges to \(\mathbb{E}_{\sigma , \mathcal{C}}[v_{i,t}\int _{\mathbb{R}_{+}}u^{*}(t, x)f(x)\,dx]\) for both \(i = 1\) and \(i = 2\), both weakly and pointwise in \(t \in [0, T]\), while the limits are also differentiable in \(t\) everywhere except in the two jump points \(t_{1}\) and \(t_{2}\). This follows because everything is zero outside \([t_{1}, t_{2}]\), while both \(v_{1}\) and \(v_{2}\) are constant in \(t\) and thus of the form (4.7), (4.8) if we restrict to that interval. Subtracting from each term of (4.10) the same term but with \(u^{\epsilon }\) replaced by \(u^{*}\) and then adding it back, we can rewrite this identity as

$$\begin{aligned} & \mathbb{E}_{\sigma , \mathcal{C}}\left [V_{t}\int _{\mathbb{R}_{+}}u^{ \epsilon }(t, x)f(x)\,dx\right ] \\ & = \mathbb{E}_{\sigma , \mathcal{C}}\left [V_{0}\int _{\mathbb{R}_{+}}u_{0}(x)f(x) \,dx\right ] + r\int _{0}^{t}\mathbb{E}_{\sigma , \mathcal{C}}\left [V_{s} \int _{\mathbb{R}_{+}}u^{\epsilon }(s, x)f'(x)\,dx\right ]\,ds \\ & \phantom{=:} - \int _{0}^{t}\frac{h^{2}(\sigma _{\frac{s}{\epsilon }}^{1,1})}{2} \bigg(\mathbb{E}_{\sigma , \mathcal{C}}\bigg[V_{s}\int _{\mathbb{R}_{+}}u^{ \epsilon }(s, x)f'(x)\,dx\bigg] \\ & \phantom{=:} \qquad \qquad \qquad \quad - \mathbb{E}_{\sigma , \mathcal{C}} \bigg[V_{s}\int _{\mathbb{R}_{+}}u^{*}(s, x)f'(x)\,dx\bigg]\bigg) \,ds \\ & \phantom{=:} - \int _{0}^{t}\frac{h^{2}(\sigma _{\frac{s}{\epsilon }}^{1,1})}{2} \mathbb{E}_{\sigma , \mathcal{C}}\left [V_{s}\int _{\mathbb{R}_{+}}u^{*}(s, x)f'(x)\,dx\right ]\,ds \\ &\phantom{=:} + \int _{0}^{t}\frac{h^{2} (\sigma _{\frac{s}{\epsilon }}^{1,1} )}{2} \bigg(\mathbb{E}_{\sigma , \mathcal{C}}\bigg[V_{s}\int _{\mathbb{R}_{+}}u^{ \epsilon }(s, x)f''(x)\,dx\bigg] \\ & \phantom{=:} \qquad \qquad \qquad \quad - \mathbb{E}_{\sigma , \mathcal{C}} \bigg[V_{s}\int _{\mathbb{R}_{+}}u^{*}(s, x)f''(x)\,dx\bigg]\bigg) \,ds \\ & \phantom{=:} + \int _{0}^{t}\frac{h^{2} (\sigma _{\frac{s}{\epsilon }}^{1,1} )}{2} \mathbb{E}_{\sigma , \mathcal{C}}\left [V_{s}\int _{\mathbb{R}_{+}}u^{*}(s, x)f''(x)\,dx\right ]\,ds \\ & \phantom{=:} +\int _{0}^{t}\mathbb{E}_{\sigma , \mathcal{C}}\left [v_{1,s}\int _{ \mathbb{R}_{+}} u^{\epsilon }(s, x)f(x)\,dx\right ]\,ds \\ & \phantom{=:} + \rho _{1,1}\int _{0}^{t}h (\sigma _{\frac{s}{\epsilon }}^{1,1} ) \bigg(\mathbb{E}_{\sigma , \mathcal{C}}\bigg[v_{2,s}\int _{ \mathbb{R}_{+}}u^{\epsilon }(s, x)f'(x)\,dx\bigg] \\ & \phantom{=:} \qquad \qquad \qquad \qquad \,\, - \mathbb{E}_{\sigma , \mathcal{C}}\bigg[v_{2,s}\int _{\mathbb{R}_{+}}u^{*}(s, x)f'(x)\,dx \bigg]\bigg)\,ds \\ &\phantom{=:} + \rho _{1,1}\int _{0}^{t}h (\sigma _{\frac{s}{\epsilon }}^{1,1} ) 
\mathbb{E}_{\sigma , \mathcal{C}}\left [v_{2,s}\int _{\mathbb{R}_{+}}u^{*}(s, x)f'(x)\,dx\right ]\,ds. \end{aligned}$$
(4.12)

Then we have

$$\begin{aligned} & \bigg|\int _{0}^{t}h (\sigma _{\frac{s}{\epsilon }}^{1,1} )\bigg(\mathbb{E}_{\sigma , \mathcal{C}}\left [v_{2,s}\int _{ \mathbb{R}_{+}}u^{\epsilon }(s, x)f'(x)\,dx\right ] \\ & \qquad \qquad \qquad - \mathbb{E}_{\sigma , \mathcal{C}}\bigg[v_{2,s}\int _{\mathbb{R}_{+}}u^{*}(s, x)f'(x)\,dx \bigg]\bigg)\,ds \bigg| \\ & \leq C\int _{0}^{t}\bigg|\mathbb{E}_{\sigma , \mathcal{C}}\left [v_{2,s} \int _{\mathbb{R}_{+}}u^{\epsilon }(s, x)f'(x)\,dx\right ] \\ & \qquad\quad \quad \! - \mathbb{E}_{\sigma , \mathcal{C}}\bigg[v_{2,s} \int _{\mathbb{R}_{+}}u^{*}(s, x)f'(x)\,dx\bigg]\bigg|\,ds, \end{aligned}$$

which tends to zero (for \(\epsilon = \epsilon _{k_{n}}\) as \(n \rightarrow \infty \)) by dominated convergence, since the quantity inside the last integral converges to zero pointwise and can be dominated by using (4.9). The same argument shows that the fourth and sixth terms in (4.12) also tend to zero along the same subsequence. Finally, for any term of the form

$$\begin{aligned} \int _{0}^{t}h^{p} (\sigma _{\frac{s}{\epsilon }}^{1,1} )\mathbb{E}_{ \sigma , \mathcal{C}}\left [V_{s}\int _{\mathbb{R}_{+}}u^{*}(s, x)f^{(m)}(x) \,dx\right ]\,ds \end{aligned}$$

for \(p, m \in \{0, 1, 2\}\), we recall the differentiability of the second factor inside the \(ds\)-integral (which was mentioned earlier) and then integrate by parts to write it as

$$\begin{aligned} & \int _{0}^{t}h^{p} (\sigma _{\frac{w}{\epsilon }}^{1,1} )\,dw \bigg(\mathbb{E}_{\sigma , \mathcal{C}}\left [V_{t}\int _{\mathbb{R}_{+}}u^{*}(t, x)f^{(m)}(x)\,dx\right ]\bigg) \\ & -\int _{0}^{t}\int _{0}^{s}h^{p} (\sigma _{\frac{w}{\epsilon }}^{1,1} )\,dw\bigg(\mathbb{E}_{\sigma , \mathcal{C}}\left [V_{s}\int _{ \mathbb{R}_{+}}u^{*}(s, x)f^{(m)}(x)\,dx\right ]\bigg)'\,ds, \end{aligned}$$

which converges by the positive recurrence property to the quantity

$$\begin{aligned} & t\mathbb{E} [h^{p} (\sigma ^{1,1,1,*} ) \, | \, \mathcal{C} ] \bigg( \mathbb{E}_{\sigma , \mathcal{C}}\left [V_{t}\int _{\mathbb{R}_{+}}u^{*}(t, x)f^{(m)}(x)\,dx\right ]\bigg) \\ & -\int _{0}^{t}s\mathbb{E} [h^{p} (\sigma ^{1,1,1,*} ) \, | \, \mathcal{C} ]\bigg(\mathbb{E}_{\sigma , \mathcal{C}}\left [V_{s}\int _{ \mathbb{R}_{+}}u^{*}(s, x)f^{(m)}(x)\,dx\right ]\bigg)'\,ds. \end{aligned}$$

Using integration by parts once more, this last expression is equal to

$$\begin{aligned} \mathbb{E} [h^{p} (\sigma ^{1,1,1,*} ) \, | \, \mathcal{C} ]\int _{0}^{t} \mathbb{E}_{\sigma , \mathcal{C}}\left [V_{s}\int _{\mathbb{R}_{+}}u^{*}(s, x)f^{(m)}(x)\,dx\right ]\,ds. \end{aligned}$$

This last convergence result also holds if we replace \(V\) by \(v_{1}\) or \(v_{2}\), as we can show by following exactly the same steps in the subinterval \([t_{1}, t_{2}]\) (where \(v_{i}\) is supported for \(i \in \{1, 2\}\) and where we have differentiability that allows integration by parts).

If we now set \(\epsilon = \epsilon _{k_{n}}\) in (4.12), take \(n \rightarrow \infty \) and substitute all the above convergence results, we obtain

$$\begin{aligned} & \mathbb{E}_{\sigma , \mathcal{C}}\left [V_{t}\int _{\mathbb{R}_{+}}u^{*}(t, x)f(x)\,dx\right ] \\ & = \mathbb{E}_{\sigma , \mathcal{C}}\left [V_{0}\int _{\mathbb{R}_{+}}u_{0}(x)f(x) \,dx\right ] \\ & \phantom{=:}+ \bigg(r-\frac{\sigma _{2,1}^{2}}{2}\bigg)\int _{0}^{t}\mathbb{E}_{ \sigma , \mathcal{C}}\left [V_{s}\int _{\mathbb{R}_{+}}u^{*}(s, x)f'(x) \,dx\right ]\,ds \\ & \phantom{=:} + \frac{\sigma _{2,1}^{2}}{2}\int _{0}^{t} \mathbb{E}_{\sigma , \mathcal{C}}\left [V_{s}\int _{\mathbb{R}_{+}}u^{*}(s, x)f''(x)\,dx \right ]\,ds \\ & \phantom{=:} +\int _{0}^{t}\mathbb{E}_{\sigma , \mathcal{C}}\left [v_{1,s}\int _{ \mathbb{R}_{+}} u^{*}(s, x)f(x)\,dx\right ]\,ds \\ &\phantom{=:} +\rho _{1,1}\sigma _{1,1}\int _{0}^{t}\mathbb{E}_{\sigma , \mathcal{C}}\left [v_{2,s}\int _{\mathbb{R}_{+}}u^{*}(s, x)f'(x)\,dx \right ]\,ds. \end{aligned}$$
(4.13)

Since \(\tilde{\mathbb{V}}\) is dense in \(\mathbb{V}\), for a fixed \(t \leq T\), we can have (4.13) for any square-integrable martingale \((V_{s})_{0 \leq s \leq t}\), for which we have \(v_{1,s} = 0\) for all \(0 \leq s \leq t\). Next, we denote by \(R_{u}(t, x)\) the right-hand side of (2.4). Using then Itô’s formula for the product of \(\int _{\mathbb{R}_{+}}R_{u}(s, x)f(x)\,dx\) with \(V_{s}\) at \(s = t\), subtracting \(V_{t}\int _{\mathbb{R}_{+}}u^{*}(t, x)f(x)\,dx\) from both sides, taking expectations and finally substituting from (4.13), we find that

$$\begin{aligned} \mathbb{E}_{\sigma , \mathcal{C}}\bigg[V_{t}\left (\int _{\mathbb{R}_{+}}R_{u}(t, x)f(x)\,dx - \int _{\mathbb{R}_{+}}u^{*}(t, x)f(x)\,dx\right )\bigg] = 0 \end{aligned}$$

for our fixed \(t \leq T\). Using the martingale representation theorem, \(V_{s}\) can be taken equal to \(\mathbb{E}_{\sigma , \mathcal{C}} [\mathbb{I}_{\mathcal{E}_{s}} \, | \, \sigma (W_{s'}^{0}, s' \leq s ) ]\) for all \(s \leq t\), where we define

$$\begin{aligned} \mathcal{E}_{t} = \left \{ \omega \in \Omega : \int _{\mathbb{R}_{+}}R_{u}(t, x)f(x)\,dx > \int _{\mathbb{R}_{+}}u^{*}(t, x)f(x)\,dx\right \} , \end{aligned}$$

and this implies \(V_{t} = \mathbb{I}_{\mathcal{E}_{t}}\), allowing us to write

$$\begin{aligned} \mathbb{E}_{\sigma , \mathcal{C}}\left [\mathbb{I}_{\mathcal{E}_{t}} \left (\int _{\mathbb{R}_{+}}R_{u}(t, x)f(x)\,dx - \int _{\mathbb{R}_{+}}u^{*}(t, x)f(x)\,dx\right )\right ] = 0 \end{aligned}$$

for any \(0 \leq t \leq T\). If we integrate the above over \(t \in [0, T]\), we obtain that

$$\begin{aligned} \int _{0}^{T}\mathbb{E}_{\sigma , \mathcal{C}}\bigg[\mathbb{I}_{ \mathcal{E}_{t}}\left (\int _{\mathbb{R}_{+}}R_{u}(t, x)f(x)\,dx - \int _{\mathbb{R}_{+}}u^{*}(t, x)f(x)\,dx\right )\bigg]\,dt = 0, \end{aligned}$$

where the quantity inside the expectation is always nonnegative and becomes zero only when \(\mathbb{I}_{\mathcal{E}_{t}} = 0\). This implies \(\int _{\mathbb{R}_{+}}R_{u}(t, x)f(x)\,dx \leq \int _{\mathbb{R}_{+}}u^{*}(t, x)f(x)\,dx\) almost everywhere, and working in the same way with the indicator of the complement \(\mathbb{I}_{\mathcal{E}_{t}^{c}}\), we can deduce the opposite inequality as well. Thus we must have \(\int _{\mathbb{R}_{+}}R_{u}(t, x)f(x)\,dx = \int _{\mathbb{R}_{+}}u^{*}(t, x)f(x)\,dx\) almost everywhere, and since the function \(f\) is an arbitrary smooth function with compact support, we can deduce that \(R_{u}\) coincides with \(u^{*}\) almost everywhere, which gives (2.4).

If \(h\) is bounded from below, we can use (2.3) to obtain a uniform (independent of \(\epsilon \)) bound for the \(H_{0}^{1}(\mathbb{R}_{+})\otimes L_{\sigma , \mathcal{C}}^{2}( \Omega \times [0, T])\)-norm of \(u^{\epsilon _{{n}}}\), which implies that along a further subsequence, the weak convergence to \(u^{*}\) also holds in that Sobolev space, in which (2.4) has a unique solution; see [5]. The proof is now complete. □

Proof of Proposition 2.7

The upper bound can be obtained by a simple Cauchy–Schwarz inequality, writing

$$\begin{aligned} \tilde{\sigma } = &\sqrt{\mathbb{E} [h (\sigma ^{1, 2, 1, *} )h ( \sigma ^{1, 2, 2, *} ) ]} \leq \sqrt{\sqrt{\mathbb{E} [h^{2} (\sigma ^{1, 2, 1, *} ) ]}\sqrt{\mathbb{E} [h^{2} (\sigma ^{1, 2, 2, *} ) ]}} = \sqrt{\sigma ^{2}_{2,1}} \\ =& \sigma _{2,1}. \end{aligned}$$

This calculation shows that the upper bound is attained only when \(\sigma ^{i, j, 1, *} = \sigma ^{i, j, 2, *}\) for all \(i\) and \(j\) with \(i \neq j\), which happens only when all the assets share a common stochastic volatility (i.e., \(\rho _{2} = 1\)).
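As a quick sanity check of this Cauchy–Schwarz step, the inequality can be verified numerically on arbitrary correlated samples; the common factor, the samples `a`, `b` and the sample size below are illustrative placeholders, not objects from the model.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Two correlated, nonnegative samples standing in for
# h(sigma^{1,2,1,*}) and h(sigma^{1,2,2,*}).
common = rng.standard_normal(n)
a = np.abs(common + 0.5 * rng.standard_normal(n))
b = np.abs(common + 0.5 * rng.standard_normal(n))

# Sample analogue of tilde{sigma} and of the Cauchy-Schwarz upper bound.
tilde_sigma = np.sqrt(np.mean(a * b))
upper = np.sqrt(np.sqrt(np.mean(a * a)) * np.sqrt(np.mean(b * b)))
```

Replacing `b` by `a` (the common-volatility case \(\rho _{2} = 1\)) makes the two quantities coincide, in line with the attainability discussion above.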

For the lower bound, considering our volatility processes for \(i=1\) and \(i=2\) started from their one-dimensional stationary distributions independently, we have for any \(t, \epsilon \geq 0\) that

$$\begin{aligned} & \mathbb{E}\bigg[\frac{1}{t}\int _{0}^{t}h (\sigma _{ \frac{s}{\epsilon }}^{1, 1} )h (\sigma _{\frac{s}{\epsilon }}^{2, 1} ) \,ds\bigg] \\ & = \frac{1}{t}\int _{0}^{t}\mathbb{E} [h (\sigma _{ \frac{s}{\epsilon }}^{1, 1} )h (\sigma _{\frac{s}{\epsilon }}^{2, 1} ) ]\,ds \\ & = \frac{1}{t}\int _{0}^{t}\mathbb{E} [h (\sigma _{ \frac{s}{\epsilon }}^{1, 1} ) ]\mathbb{E} [h (\sigma _{ \frac{s}{\epsilon }}^{2, 1} ) ]\,ds \\ & \phantom{=:}+ \frac{1}{t}\int _{0}^{t}\mathbb{E}\big[ \big(h (\sigma _{ \frac{s}{\epsilon }}^{1, 1} ) - \mathbb{E} [h (\sigma _{ \frac{s}{\epsilon }}^{1, 1} ) ] \big)\big(h (\sigma _{ \frac{s}{\epsilon }}^{2, 1} ) - \mathbb{E} [h (\sigma _{ \frac{s}{\epsilon }}^{2, 1} ) ]\big)\big]\,ds \\ & = \sigma _{1,1}^{2} + \frac{1}{t}\int _{0}^{t}\mathbb{E}\Big[ \mathbb{E}\big[\big(h (\sigma _{\frac{s}{\epsilon }}^{1, 1} ) - \mathbb{E} [h (\sigma _{\frac{s}{\epsilon }}^{1, 1} ) ]\big) \big(h ( \sigma _{\frac{s}{\epsilon }}^{2, 1} ) - \mathbb{E} [h (\sigma _{ \frac{s}{\epsilon }}^{1, 1} ) ]\big) \, \big| \, B^{0}\big]\Big]\,ds \\ & = \sigma _{1,1}^{2} + \frac{1}{t}\int _{0}^{t}\mathbb{E}\Big[ \mathbb{E}\big[\big(h (\sigma _{\frac{s}{\epsilon }}^{1, 1} ) - \sigma _{1,1}\big) \, \big| \, B^{0}\big]^{2}\Big]\,ds \\ & \geq \sigma _{1,1}^{2}, \end{aligned}$$
(4.14)

since \(\sigma ^{1,1}\) and \(\sigma ^{2,1}\) are identically distributed, and also independent when \(B^{0}\) is given. Taking \(\epsilon \rightarrow 0{+}\) in (4.14) and using the positive recurrence property, the definition of \(\tilde{\sigma }\) and dominated convergence on the left-hand side (since the quantity inside the expectation there is bounded by the square of an upper bound of \(h\)), we obtain the lower bound, i.e., \(\tilde{\sigma } \geq \sigma _{1,1}\), which can also be shown to be unattainable in general. Indeed, if we choose \(h\) such that its composition \(\tilde{h}\) with the square function is strictly increasing and convex, and if \(g\) is chosen to be a square-root function (thus we are in the CIR volatility case), for any \(\alpha > 0\), we have

$$\begin{aligned} & \frac{1}{t}\int _{0}^{t}\mathbb{E}\Big[\mathbb{E}\big[\big(h ( \sigma _{\frac{s}{\epsilon }}^{1, 1} ) - \sigma _{1,1}\big) \, \big| \, B^{0}\big]^{2}\Big]\,ds \\ & = \mathbb{E}\bigg[\frac{1}{t}\int _{0}^{t}\bigg(\mathbb{E}\Big[ \tilde{h}\Big(\sqrt{\sigma _{\frac{s}{\epsilon }}^{1, 1}}\Big)\, \Big| \, B^{0}\Big]- \sigma _{1,1}\bigg)^{2} \, ds\bigg] \\ & \geq \alpha ^{2}\mathbb{E}\bigg[\frac{1}{t}\int _{0}^{t}\mathbb{I}_{ \{\sigma _{\frac{s}{\epsilon }}^{B^{0}, h} \geq \alpha + \sigma _{1,1} \}}\,ds\bigg], \end{aligned}$$

where \(\sigma _{s}^{B^{0}, h} := \mathbb{E}[\tilde{h}(\sqrt{\sigma _{s}^{1, 1}}) \, | \, B^{0}] \geq \tilde{h}(\sigma _{s}^{B^{0}})\) for \(\sigma _{s}^{B^{0}} := \mathbb{E}[\sqrt{\sigma _{s}^{1, 1}}\, | \, B^{0}]\), which implies that

$$ \frac{1}{t}\int _{0}^{t}\mathbb{E}\Big[\mathbb{E}\big[\big(h (\sigma _{ \frac{s}{\epsilon }}^{1, 1} ) - \sigma _{1,1}\big) \, \big| \, B^{0} \big]^{2}\Big]\,ds \geq \alpha ^{2}\mathbb{E}\bigg[\frac{1}{t}\int _{0}^{t} \mathbb{I}_{\{\sigma _{\frac{s}{\epsilon }}^{B^{0}} \geq \tilde{h}^{-1} (\alpha + \sigma _{1,1} )\}}\,ds\bigg]. $$
(4.15)
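The mechanism behind the lower bound (4.14) is that conditioning on the common noise produces a nonnegative conditional-variance term. A toy Monte Carlo sketch of this decomposition, in which a Gaussian factor `z` stands in for \(B^{0}\) and the square function stands in for \(h\) (all names and distributions here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

z = rng.standard_normal(n)            # common factor (stand-in for B^0)
x1 = z + rng.standard_normal(n)       # conditionally independent given z
x2 = z + rng.standard_normal(n)

def h(x):
    return x * x                      # stand-in for the function h

cross = np.mean(h(x1) * h(x2))             # E[h(X_1) h(X_2)]
indep = np.mean(h(x1)) * np.mean(h(x2))    # E[h(X_1)] E[h(X_2)]
# cross - indep estimates Var(E[h(X)|Z]) >= 0 (here Var(Z^2 + 1) = 2),
# which is exactly the conditional-variance term appearing in (4.14).
```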

Let \(\sigma ^{\rho }\) be the solution to the SDE

$$\begin{aligned} \sigma _{t}^{\rho } = \sigma _{0}^{B^{0}} + \frac{1}{2}\int _{0}^{t} \left (\kappa \theta - \frac{v^{2}}{4}\right ) \frac{1}{\sigma _{s}^{\rho }}\,ds - \frac{\kappa }{2}\int _{0}^{t} \sigma _{s}^{\rho }\,ds + \frac{\rho _{2}v}{2}B_{t}^{0}. \end{aligned}$$

Then \(\sigma ^{\rho }\) can be shown to be the square root of a CIR process having the same mean-reversion and vol-of-vol as \(\sigma ^{1,1}\) and a different stationary mean, and which satisfies the Feller condition for not hitting zero in finite time. If we have \(\sigma _{t_{1}}^{\rho } > \sigma _{t_{1}}^{B^{0}}\) for some \(t_{1} > 0\), we consider \(t_{0} = \sup \{s \leq t_{1}: \sigma _{s}^{\rho } = \sigma _{s}^{B^{0}} \}\), which is well defined and nonnegative since \(\sigma _{0}^{\rho } = \sigma _{0}^{B^{0}}\). Then since \(\mathbb{E}[{\frac{1}{\sqrt{\sigma _{s}^{1,1}}}}\, | \, B^{0}] \geq \frac{1}{\mathbb{E}[\sqrt{\sigma _{s}^{1,1}}\, | \, B^{0}]} = \frac{1}{\sigma _{s}^{B^{0}}}\) by Jensen's inequality, we have

$$\begin{aligned} \sigma _{t_{1}}^{B^{0}} =& \sigma _{t_{0}}^{B^{0}} + \frac{1}{2} \int _{t_{0}}^{t_{1}}\left (\kappa \theta - \frac{v^{2}}{4}\right ) \mathbb{E}\Bigg[{\frac{1}{\sqrt{\sigma _{s}^{1,1}}}}\, \Bigg| \, B^{0} \Bigg]\,ds \\ & - \frac{\kappa }{2}\int _{t_{0}}^{t_{1}}\sigma _{s}^{B^{0}}\,ds + \frac{\rho _{2}v}{2} (B_{t_{1}}^{0} - B_{t_{0}}^{0} ) \\ \geq & \sigma _{t_{0}}^{B^{0}} + \frac{1}{2}\int _{t_{0}}^{t_{1}} \left (\kappa \theta - \frac{v^{2}}{4}\right ) \frac{1}{\sigma _{s}^{B^{0}}}\,ds - \frac{\kappa }{2}\int _{t_{0}}^{t_{1}} \sigma _{s}^{B^{0}}\,ds + \frac{\rho _{2}v}{2} (B_{t_{1}}^{0} - B_{t_{0}}^{0} ) \\ \geq & \sigma _{t_{0}}^{\rho } + \frac{1}{2}\int _{t_{0}}^{t_{1}} \left (\kappa \theta - \frac{v^{2}}{4}\right ) \frac{1}{\sigma _{s}^{\rho }}\,ds - \frac{\kappa }{2}\int _{t_{0}}^{t_{1}} \sigma _{s}^{\rho }\,ds + \frac{\rho _{2}v}{2} (B_{t_{1}}^{0} - B_{t_{0}}^{0} ) \\ =& \sigma _{t_{1}}^{\rho } \end{aligned}$$

which is a contradiction. Thus \(\sigma _{s}^{\rho } \leq \sigma _{s}^{B^{0}}\) for all \(s \geq 0\), and this gives in (4.15) that

$$\frac{1}{t}\int _{0}^{t}\mathbb{E}\Big[\mathbb{E}\big[\big(h (\sigma _{ \frac{s}{\epsilon }}^{1, 1} ) - \sigma _{1,1}\big) \, \big| \, B^{0} \big]^{2}\Big]\,ds \geq \alpha ^{2}\mathbb{E}\left [\frac{1}{t}\int _{0}^{t} \mathbb{I}_{\{\sigma _{\frac{s}{\epsilon }}^{\rho } \geq \tilde{h}^{-1} (\alpha + \sigma _{1,1} )\}}\,ds\right ]. $$

By the positive recurrence of \(\sigma ^{\rho }\) (which is the square root of a CIR process, the ergodicity of which can be deduced from Proposition 2.3), the right-hand side of the above converges to \(\alpha ^{2}\mathbb{P}[\sigma ^{\rho , *} \geq \tilde{h}^{-1}(\alpha + \sigma _{1,1})]\) as \(\epsilon \rightarrow 0{+}\), where \(\sigma ^{\rho , *}\) has the stationary distribution of \(\sigma ^{\rho }\). This expression can only be zero when \(\sigma ^{\rho , *}\) is a constant, and since the square of \(\sigma ^{\rho }\) satisfies Feller’s boundary condition, this can only happen when \(\rho _{2} = 0\). In that case, we can easily check that \(\sigma ^{1, 2, 1, *}\) and \(\sigma ^{1, 2, 2, *}\) are independent, which implies that \(\tilde{\sigma } = \sigma _{1,1}\). This completes the proof. □
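For completeness, here is a sketch of the Itô computation behind the claim that \(\sigma ^{\rho }\) is the square root of a CIR process; the parameters \(\hat{\theta }\) and \(\hat{v}\) are auxiliary names introduced only for this sketch. If \(X\) solves \(dX_{t} = \kappa (\hat{\theta } - X_{t})\,dt + \hat{v}\sqrt{X_{t}}\,dB_{t}^{0}\), then applying Itô's formula to \(\sigma = \sqrt{X}\) gives

$$\begin{aligned} d\sigma _{t} = \frac{1}{2\sigma _{t}}\,dX_{t} - \frac{1}{8\sigma _{t}^{3}}\,d\langle X\rangle _{t} = \frac{1}{2\sigma _{t}}\left (\kappa \hat{\theta } - \frac{\hat{v}^{2}}{4}\right )dt - \frac{\kappa }{2}\sigma _{t}\,dt + \frac{\hat{v}}{2}\,dB_{t}^{0}, \end{aligned}$$

so the drift has the same structure as in the equation defining \(\sigma ^{\rho }\), with the same mean-reversion speed \(\kappa \), and matching the remaining coefficients identifies \(\hat{\theta }\) and \(\hat{v}\).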

Proof of Corollary 2.8

Let us suppose that \(\mathbb{P}[X_{t}^{1, \epsilon } \in \mathcal{I} \, | \, W_{\cdot }^{0}, B_{\cdot }^{0}, \mathcal{G}]\) converges to \(\mathbb{P}[X_{t}^{1, *} \in \mathcal{I} \, | \, W_{\cdot }^{0}, \mathcal{G}]\) in probability, under the assumptions of both Theorems 2.4 and 2.6. The same convergence then holds in a strong \(L^{2}\)-sense for some sequence \(\epsilon _{n} \downarrow 0\), since it will hold ℙ-a.s. for some sequence and then we can apply dominated convergence. Therefore, the same convergence must hold weakly in \(L^{2}\) as well. However, assuming for simplicity that \((r_{i}, \rho _{1,i})\) is also a constant vector \((r, \rho _{1})\) for all \(i\) and fixing a sufficiently integrable and \(\sigma (W_{\cdot }^{0}, B_{\cdot }^{0}) \cap \mathcal{G}\)-measurable random variable \(\Xi \), by Theorem 2.6, we have

$$\begin{aligned} \lim _{n \rightarrow \infty }\mathbb{E}\big[\Xi \mathbb{P} [X_{t}^{1, \epsilon _{n}} \in \mathcal{I} \, | \, W^{0}, B^{0}, \mathcal{G} ] \big] & = \lim _{n \rightarrow \infty }\mathbb{E}\big[\Xi \mathbb{P} [X_{t}^{1, \epsilon _{n}} \in \mathcal{I} \, | \, W^{0}, \sigma ^{1,1}, \mathcal{G} ] \big] \\ & = \lim _{n \rightarrow \infty }\mathbb{E}\left [\int _{0}^{+ \infty }\Xi \mathbb{I}_{\mathcal{I}}(x)u^{\epsilon _{n}}(t, x)\,dx \right ] \\ & = \mathbb{E}\left [\int _{0}^{+\infty }\Xi \mathbb{I}_{\mathcal{I}}(x)u^{*}(t, x)\,dx\right ] \\ & = \mathbb{E}\big[\Xi \mathbb{P} [X_{t}^{1, w} \in \mathcal{I} \, | \, W^{0}, \mathcal{G} ]\big], \end{aligned}$$

where for each \(i\) we define

$$\begin{aligned} X_{t}^{i, w} &= x^{i} + \bigg(r - \frac{\sigma _{2,1}^{2}}{2} \bigg)t + \rho _{1}'\sigma _{2,1}W_{t}^{0} + \sqrt{1 - (\rho _{1}' )^{2}} \sigma _{2,1}W_{t}^{i}, \qquad 0 \leq t \leq T_{i}^{w}, \\ X_{t}^{i, w} &= 0,\qquad t \geq T_{i}^{w} := \inf \{ t\geq 0: X_{t}^{i, w}=0\} \end{aligned}$$

with \(\rho _{1}' = \rho _{1}\frac{\sigma _{1,1}}{\sigma _{2,1}}\), in which the density of \(X_{t}^{i, w}\) given \(W^{0}\) and \(\mathcal{G}\) is the unique solution \(u^{*}\) to (2.4); see [18]. Therefore, by the uniqueness of a weak limit, we must have \(\mathbb{P}[X_{t}^{1, *} \in \mathcal{I} \, | \, W^{0}, \mathcal{G}] = \mathbb{P}[X_{t}^{1, w} \in \mathcal{I} \, | \, W^{0}, \mathcal{G}]\) ℙ-almost surely, which cannot hold for every interval ℐ, since otherwise the processes \(X_{\cdot }^{1, w}\) and \(X_{\cdot }^{1, *}\) would coincide, which is clearly not the case here. Indeed, this could only be true when \(\tilde{\rho }_{1,1} = \rho _{1}'\), which is equivalent to \(\tilde{\sigma } = \sigma _{1,1}\), and by Proposition 2.7, this is generally not the case unless \(\rho _{2} = 0\). □

5 Proofs: small vol-of-vol setting

We now proceed to the proofs of Proposition 3.1, Theorem 3.2 and Corollary 3.3, the main results of Sect. 3.

Proof of Proposition 3.1

First, we show that each volatility process has a finite \(2p\)-moment for any \(p \in \mathbb{N}\). Indeed, we fix a \(p \in \mathbb{N}\) and consider the sequence of stopping times \((\tau _{n, \epsilon })_{n \in \mathbb{N}}\), where \(\tau _{n, \epsilon } = \inf \{t \geq 0: \sigma _{t}^{i, \epsilon } > n \}\). With \(\sigma _{t}^{i, n, \epsilon } = \sigma _{t \wedge \tau _{n, \epsilon }}^{i, \epsilon }\), Itô’s formula gives

$$\begin{aligned} (\sigma _{t}^{i, n, \epsilon } - \theta _{i} )^{2p} =& (\sigma _{0}^{i, n, \epsilon } - \theta _{i} )^{2p} - \frac{2p\kappa _{i}}{\epsilon } \int _{0}^{t}\mathbb{I}_{ [0, \tau _{n, \epsilon } ]}(s) (\sigma _{s}^{i, n, \epsilon } - \theta _{i} )^{2p}\,ds \\ & + 2p\xi _{i}\int _{0}^{t}\mathbb{I}_{ [0, \tau _{n, \epsilon } ]}(s) (\sigma _{s}^{i, n, \epsilon } - \theta _{i} )^{2p-1}g (\sigma _{s}^{i, n, \epsilon } )\,d\tilde{B}_{s}^{i} \\ & + p(2p-1)\xi _{i}^{2}\int _{0}^{t}\mathbb{I}_{ [0, \tau _{n, \epsilon } ]}(s) (\sigma _{s}^{i, n, \epsilon } - \theta _{i} )^{2p-2}g^{2} (\sigma _{s}^{i, n, \epsilon } )\,ds \qquad \quad \end{aligned}$$

for \(\tilde{B}^{i} = \sqrt{1-\rho _{2,i}^{2}}B^{i}+\rho _{2,i}B^{0}\), where the stochastic integral is a martingale. Taking expectations, setting \(f(t, n, p, \epsilon ) = \mathbb{E}[\mathbb{I}_{ [0, \tau _{n, \epsilon } ]}(t)(\sigma _{t}^{i, n, \epsilon } - \theta _{i})^{2p}]\) and using the growth condition \(|g(x)| \leq C_{1,g} + C_{2,g}|x|\) for all \(x \in \mathbb{R}\), the condition \(\kappa _{i} > c_{\kappa }\) and a few simple inequalities, we easily obtain

$$ f(t, n, p, \epsilon ) \leq M + \bigg(M' - \frac{2pc_{\kappa }}{\epsilon }\bigg)\int _{0}^{t}f(s, n, p, \epsilon )\,ds $$

with \(M, M'\) depending only on \(p\), the growth constants \(C_{1,g}, C_{2,g}\) and the bounds of \(\sigma ^{i}, \xi _{i}, \theta _{i}\). Thus, using Gronwall’s inequality, we get

$$\begin{aligned} \int _{0}^{t}f(s, n, p, \epsilon )\,ds \leq & M\int _{0}^{t}e^{ (M' - \frac{2pc_{\kappa }}{\epsilon } )(t-s)}\,ds \\ \leq & Me^{M't}\int _{0}^{t}e^{ - \frac{2pc_{\kappa }}{\epsilon } (t-s)} \,ds \\ =& \frac{\epsilon }{2pc_{\kappa }}e^{M't} (1 - e^{ - \frac{2pc_{\kappa }}{\epsilon } t} ) \\ < & \frac{\epsilon }{2pc_{\kappa }}e^{M't}, \end{aligned}$$

and using then Fatou’s lemma on the left-hand side of the above, we obtain

$$ \int _{0}^{t}f(s, p, \epsilon )\,ds \leq \frac{\epsilon }{2pc_{\kappa }}e^{M't}, $$

where \(f(t, p, \epsilon ) := \mathbb{E} [ (\sigma _{t}^{i, \epsilon } - \theta _{i} )^{2p} ]\). The desired result now follows. □
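The resulting \(\mathcal{O}(\epsilon )\) scaling of the squared deviation can be observed numerically. The sketch below uses a constant-diffusion toy model \(d\sigma _{t} = -(\kappa /\epsilon )(\sigma _{t} - \theta )\,dt + \xi \,dB_{t}\) in place of the general dynamics, with arbitrary parameter values; it illustrates the scaling only, not the model itself.

```python
import numpy as np

def avg_sq_deviation(eps, kappa=1.0, theta=0.5, xi=0.8, T=20.0, seed=0):
    """Time average of (sigma_t - theta)^2 under the toy fast
    mean-reverting dynamics, simulated by Euler-Maruyama."""
    rng = np.random.default_rng(seed)
    dt = eps / 100.0                       # time step well below eps
    n = int(T / dt)
    dW = np.sqrt(dt) * rng.standard_normal(n)
    sigma, acc = theta, 0.0
    for k in range(n):
        sigma += -(kappa / eps) * (sigma - theta) * dt + xi * dW[k]
        acc += (sigma - theta) ** 2 * dt
    return acc / T

# Stationary theory for this toy model: E[(sigma - theta)^2]
# = xi^2 * eps / (2 * kappa), i.e., O(eps), in line with the
# 2p-moment bound of Proposition 3.1.
m_big = avg_sq_deviation(0.2)
m_small = avg_sq_deviation(0.02)
```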

Proof of Theorem 3.2

We can easily check that \(v^{0, \epsilon }\) and \(v^{0}\) are the unique solutions to the SPDEs (3.1) and (3.2), respectively, in \(L^{2}(\Omega \times [0, T] ; H^{2}(\mathbb{R}_{+}))\), with the boundary conditions \(v_{x}^{0, \epsilon }(t, 0) = 0\) and \(v_{x}^{0}(t, 0) = 0\), respectively. Subtracting the SPDEs satisfied by \(v^{0, \epsilon }\) and \(v^{0}\) and setting \(v^{d, \epsilon } = v^{0} - v^{0, \epsilon }\), we easily verify that

$$\begin{aligned} v^{d,\epsilon } (t, x ) = &-\frac{1}{2}\int _{0}^{t}\big(h^{2} ( \sigma _{s}^{1, \epsilon } ) - h^{2} (\theta _{1} ) \big)v_{x}^{0, \epsilon } (s, x )\,ds \\ & + \int _{0}^{t}\left (r - \frac{h^{2}\left (\theta _{1}\right )}{2} \right )v_{x}^{d, \epsilon } (s, x )\,ds \\ & + \frac{1}{2}\int _{0}^{t}\big(h^{2} (\sigma _{s}^{1, \epsilon } ) - h^{2} (\theta _{1} ) \big)v_{xx}^{0, \epsilon } (s, x )\,ds + \int _{0}^{t} \frac{h^{2} (\theta _{1} )}{2}v_{xx}^{d, \epsilon } (s, x )\,ds \\ & + \rho _{1,1}\int _{0}^{t}\big(h (\sigma _{s}^{1, \epsilon } ) - h ( \theta _{1} ) \big)v_{x}^{0, \epsilon } (s, x )\,dW_{s}^{0} \\ & + \rho _{1,1}\int _{0}^{t}h (\theta _{1} )v_{x}^{d, \epsilon } (s, x )\,dW_{s}^{0}. \end{aligned}$$

Now using Itô’s formula for the \(L^{2}\)-norm (see Krylov and Rozovskii [20]), given the volatility path and \(\mathcal{C}\), we obtain

$$\begin{aligned} &\mathbb{E}_{\sigma , \mathcal{C}}\bigg[\int _{\mathbb{R}_{+}}\big(v^{d, \epsilon } (s, x )\big)^{2} \, dx\bigg] \\ & = -\int _{0}^{t}\big(h^{2} (\sigma _{s}^{1, \epsilon } ) - h^{2} ( \theta _{1} ) \big)\mathbb{E}_{\sigma , \mathcal{C}}\bigg[\int _{ \mathbb{R}_{+}}v_{x}^{0, \epsilon } (s, x )v^{d, \epsilon } (s, x )\,dx \bigg]\,ds \\ & \phantom{=:}+ 2\bigg(r - \frac{h^{2} (\theta _{1} )}{2} \bigg)\int _{0}^{t} \mathbb{E}_{\sigma , \mathcal{C}}\bigg[\int _{\mathbb{R}_{+}}v_{x}^{d, \epsilon } (s, x )v^{d, \epsilon } (s, x )\,dx\bigg]\,ds \\ & \phantom{=:} - \int _{0}^{t}\big(h^{2} (\sigma _{s}^{1, \epsilon } ) - h^{2} ( \theta _{1} ) \big) \mathbb{E}_{\sigma , \mathcal{C}}\bigg[\int _{ \mathbb{R}_{+}}v_{x}^{0, \epsilon } (s, x )v_{x}^{d, \epsilon } (s, x ) \,dx\bigg]\,ds \\ &\phantom{=:} - \int _{0}^{t}h^{2} (\theta _{1} )\mathbb{E}_{\sigma , \mathcal{C}} \bigg[\int _{\mathbb{R}_{+}}\big(v_{x}^{d, \epsilon } (s, x )\big)^{2} \, dx\bigg]\,ds \\ &\phantom{=:} + \rho _{1,1}^{2}\int _{0}^{t}\big(h (\sigma _{s}^{1, \epsilon } ) - h (\theta _{1} ) \big)^{2}\mathbb{E}_{\sigma , \mathcal{C}}\bigg[\int _{ \mathbb{R}_{+}}\big(v_{x}^{0, \epsilon } (s, x )\big)^{2} \, dx\bigg] \,ds \\ &\phantom{=:}+ 2\rho _{1,1}^{2}h (\theta _{1} )\int _{0}^{t}\big(h (\sigma _{s}^{1, \epsilon } ) - h (\theta _{1} ) \big) \mathbb{E}_{\sigma , \mathcal{C}}\bigg[\int _{\mathbb{R}_{+}}v_{x}^{0, \epsilon } (s, x )v_{x}^{d, \epsilon } (s, x )\,dx\bigg]\,ds \\ & \phantom{=:}+ \rho _{1,1}^{2}\int _{0}^{t}h^{2} (\theta _{1} )\mathbb{E}_{\sigma , \mathcal{C}}\bigg[\int _{\mathbb{R}_{+}}\big(v_{x}^{d, \epsilon } (s, x )\big)^{2} \, dx\bigg]\,ds + N(t, \epsilon ), \end{aligned}$$
(5.1)

where \(N(t, \epsilon )\) is a noise term arising from the correlation between \(B^{0}\) and \(W^{0}\) and satisfying \(\mathbb{E}[N(t, \epsilon )] = 0\). In particular, writing \(W^{0} = \sqrt{1 - \rho _{3}^{2}}V^{0} + \rho _{3}B^{0}\) for some Brownian motion \(V^{0}\) independent of \(B^{0}\), we have

$$ N(t, \epsilon ) = 2\rho _{1,1}\int _{0}^{t}\mathbb{E}_{\sigma , \mathcal{C}}\bigg[h (\sigma _{s}^{1, \epsilon } )\int _{\mathbb{R}_{+}}v^{d, \epsilon } (s, x )v_{x}^{0, \epsilon } (s, x )\,dx\bigg]\,dB_{s}^{0}. $$

Next, we can apply part 2 of Theorem 4.1 in [15] to the SPDE (3.1) to find

$$ \Vert v_{x}^{0, \epsilon }(s, \cdot ) \Vert _{L_{\sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )} = \Vert u^{\epsilon }(s, \cdot ) \Vert _{L_{\sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )} \leq \Vert u_{0}(\cdot ) \Vert _{L^{2} (\Omega \times \mathbb{R}_{+} )} $$

for all \(s \geq 0\). Using this bound, we obtain the estimate

$$\begin{aligned} &\int _{0}^{t}\big(h^{2} (\sigma _{s}^{1, \epsilon } ) - h^{2} ( \theta _{1} ) \big)\mathbb{E}_{\sigma , \mathcal{C}}\bigg[\int _{ \mathbb{R}_{+}}v_{x}^{0, \epsilon }\left (s, x\right )v^{d, \epsilon } (s, x )\,dx\bigg]\,ds \\ & \leq \int _{0}^{t}\big(h^{2} (\sigma _{s}^{1, \epsilon } ) - h^{2} ( \theta _{1} ) \big) \Vert v_{x}^{0, \epsilon }(s, \cdot ) \Vert _{L_{ \sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )} \Vert v^{d, \epsilon }(s, \cdot ) \Vert _{L_{\sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )}\,ds \\ & \leq \Vert u_{0}(\cdot ) \Vert _{L^{2} (\Omega \times \mathbb{R}_{+} )}\sqrt{\int _{0}^{t}\big(h^{2} (\sigma _{s}^{1, \epsilon } ) - h^{2} ( \theta _{1} ) \big)^{2} \, ds} \sqrt{\int _{0}^{t} \Vert v^{d, \epsilon }(s, \cdot ) \Vert _{L_{\sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )}^{2} \, ds} \\ & \leq \frac{1}{2} \Vert u_{0}(\cdot ) \Vert _{L^{2} (\Omega \times \mathbb{R}_{+} )}^{2}\int _{0}^{t}\big(h^{2} (\sigma _{s}^{1, \epsilon } ) - h^{2} (\theta _{1} ) \big)^{2} \, ds \\ &\phantom{=}+ \frac{1}{2}\int _{0}^{t} \Vert v^{d, \epsilon }(s, \cdot ) \Vert _{L_{ \sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )}^{2}\,ds, \end{aligned}$$
(5.2)

and in the same way

$$\begin{aligned} &\int _{0}^{t}\big(h^{2} (\sigma _{s}^{1, \epsilon } ) - h^{2} ( \theta _{1} ) \big)\mathbb{E}_{\sigma , \mathcal{C}}\left [\int _{ \mathbb{R}_{+}}v_{x}^{0, \epsilon } (s, x )v_{x}^{d, \epsilon } (s, x ) \,dx\right ]\,ds \\ & \leq \Vert u_{0}(\cdot ) \Vert _{L^{2} (\Omega \times \mathbb{R}_{+} )}\sqrt{\int _{0}^{t}\big(h^{2} (\sigma _{s}^{1, \epsilon } ) - h^{2} ( \theta _{1} ) \big)^{2} \, ds} \sqrt{\int _{0}^{t} \Vert v_{x}^{d, \epsilon }(s, \cdot ) \Vert _{L_{\sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )}^{2} \, ds} \\ & \leq \frac{1}{2\eta } \Vert u_{0}(\cdot ) \Vert _{L^{2} (\Omega \times \mathbb{R}_{+} )}^{2}\int _{0}^{t}\big(h^{2} (\sigma _{s}^{1, \epsilon } ) - h^{2} (\theta _{1} ) \big)^{2} \, ds \\ &\phantom{=:}+ \frac{\eta }{2}\int _{0}^{t} \Vert v_{x}^{d, \epsilon }(s, \cdot ) \Vert _{L_{\sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )}^{2} \, ds \end{aligned}$$
(5.3)

and

$$\begin{aligned} &\int _{0}^{t}\big(h (\sigma _{s}^{1, \epsilon } ) - h (\theta _{1} ) \big)\mathbb{E}_{\sigma , \mathcal{C}}\left [\int _{\mathbb{R}_{+}}v_{x}^{0, \epsilon }\left (s, x\right )v_{x}^{d, \epsilon } (s, x )\,dx\right ] \,ds \\ & \leq \left \Vert u_{0}(\cdot )\right \Vert _{L^{2} (\Omega \times \mathbb{R}_{+} )}\sqrt{\int _{0}^{t}\big(h (\sigma _{s}^{1, \epsilon } ) - h (\theta _{1} ) \big)^{2} \, ds} \sqrt{\int _{0}^{t} \Vert v_{x}^{d, \epsilon }(s, \cdot ) \Vert _{L_{\sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )}^{2} \, ds} \\ & \leq \frac{1}{2\eta }\left \Vert u_{0}(\cdot )\right \Vert _{L^{2} ( \Omega \times \mathbb{R}_{+} )}^{2}\int _{0}^{t}\big(h (\sigma _{s}^{1, \epsilon } ) - h (\theta _{1} ) \big)^{2} \, ds \\ & \phantom{=:} + \frac{\eta }{2}\int _{0}^{t} \Vert v_{x}^{d, \epsilon }(s, \cdot ) \Vert _{L_{\sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )}^{2} \, ds \end{aligned}$$
(5.4)

for any \(\eta > 0\). Moreover, we have the estimate

$$\begin{aligned} & \int _{0}^{t}\mathbb{E}_{\sigma , \mathcal{C}}\left [\int _{ \mathbb{R}_{+}}v_{x}^{d, \epsilon } (s, x )v^{d, \epsilon } (s, x )\,dx \right ]\,ds \\ & \leq \int _{0}^{t} \Vert v_{x}^{d, \epsilon }(s, \cdot ) \Vert _{L_{ \sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )} \Vert v^{d, \epsilon }(s, \cdot ) \Vert _{L_{\sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )}\,ds \\ & \leq \left (\int _{0}^{t} \Vert v^{d, \epsilon }(s, \cdot ) \Vert _{L_{ \sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )}^{2} \, ds \right )^{1/2}\left (\int _{0}^{t} \Vert v_{x}^{d, \epsilon }(s, \cdot ) \Vert _{L_{\sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )}^{2} \, ds\right )^{1/2} \\ & \leq \frac{1}{2\eta }\int _{0}^{t} \Vert v^{d, \epsilon }(s, \cdot ) \Vert _{L_{\sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )}^{2} \, ds + \frac{\eta }{2}\int _{0}^{t} \Vert v_{x}^{d, \epsilon }(s, \cdot ) \Vert _{L_{\sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )}^{2} \, ds , \end{aligned}$$
(5.5)

and by using \(\Vert v_{x}^{0, \epsilon }(s, \cdot )\Vert _{L_{\sigma , \mathcal{C}}^{2} \left (\Omega \times \mathbb{R}_{+}\right )} \leq \Vert u_{0}(\cdot ) \Vert _{L^{2}(\Omega \times \mathbb{R}_{+})}\) again, we also obtain

$$\begin{aligned} & \int _{0}^{t}\big(h (\sigma _{s}^{1, \epsilon } ) - h (\theta _{1} ) \big)^{2}\mathbb{E}_{\sigma , \mathcal{C}}\left [\int _{\mathbb{R}_{+}} \big(v_{x}^{0, \epsilon } (s, x )\big)^{2} \, dx\right ]\,ds \\ & \qquad \leq \left \Vert u_{0}(\cdot )\right \Vert _{L^{2}\left ( \Omega \times \mathbb{R}_{+}\right )}^{2}\int _{0}^{t}\big(h (\sigma _{s}^{1, \epsilon } ) - h (\theta _{1} ) \big)^{2} \, ds. \end{aligned}$$
(5.6)
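Each of the cross terms in (5.2)–(5.6) is split with the weighted Young inequality \(ab \leq \frac{a^{2}}{2\eta } + \frac{\eta }{2}b^{2}\), valid for every \(\eta > 0\); taking \(\eta \) small is what lets the gradient terms be absorbed into the left-hand side of (5.1). A minimal numerical check of the inequality (sample size and \(\eta \) values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
a = rng.standard_normal(1_000)
b = rng.standard_normal(1_000)

# a*b <= a^2/(2*eta) + eta*b^2/2 for every eta > 0, since
# a^2/(2*eta) + eta*b^2/2 - a*b = (a - eta*b)^2 / (2*eta) >= 0.
young_ok = all(
    bool(np.all(a * b <= a * a / (2 * eta) + eta * b * b / 2 + 1e-9))
    for eta in (0.1, 1.0, 10.0)
)
```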

Using (5.2)–(5.6) in (5.1) and then taking \(\eta \) sufficiently small, we get the estimate

$$\begin{aligned} & \Vert v^{d, \epsilon }(t, \cdot ) \Vert _{L_{\sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )}^{2} + m\int _{0}^{t} \Vert v_{x}^{d, \epsilon }(s, \cdot ) \Vert _{L_{\sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )}^{2} \, ds \\ & \leq M\int _{0}^{t} \Vert v^{d, \epsilon }(s, \cdot ) \Vert _{L_{ \sigma , \mathcal{C}}^{2} (\Omega \times \mathbb{R}_{+} )}^{2} \, ds + N(t, \epsilon ) + MH(\epsilon ) \end{aligned}$$
(5.7)

for all \(t \in [0, T]\), where

$$ H(\epsilon ) = \int _{0}^{T}\big(h^{2} (\sigma _{s}^{1, \epsilon } ) - h^{2} (\theta _{1} ) \big)^{2} \, ds + \int _{0}^{T}\big(h (\sigma _{s}^{1, \epsilon } ) - h (\theta _{1} ) \big)^{2} \, ds $$

and \(M, m > 0\) are constants independent of the fixed volatility path. Taking expectations in (5.7) to average over all volatility paths, we find that

$$\begin{aligned} & \Vert v^{d, \epsilon }(t, \cdot ) \Vert _{L^{2} (\Omega \times \mathbb{R}_{+} )}^{2} + m\int _{0}^{t} \Vert v_{x}^{d, \epsilon }(s, \cdot ) \Vert _{L^{2} (\Omega \times \mathbb{R}_{+} )}^{2} \, ds \\ & \leq M\int _{0}^{t} \Vert v^{d, \epsilon }(s, \cdot ) \Vert _{L^{2} (\Omega \times \mathbb{R}_{+} )}^{2} \, ds + M\mathbb{E} [H(\epsilon ) ], \end{aligned}$$

and using Gronwall’s inequality on the above, we finally obtain

$$\begin{aligned} & \Vert v^{d, \epsilon }(t, \cdot ) \Vert _{L^{2} (\Omega \times \mathbb{R}_{+} )}^{2} + m\int _{0}^{t} \Vert v_{x}^{d, \epsilon }(s, \cdot ) \Vert _{L^{2} (\Omega \times \mathbb{R}_{+} )}^{2} \, ds \leq M'\mathbb{E} [H(\epsilon ) ] \end{aligned}$$

for some \(M' > 0\), with \(\mathbb{E}[H(\epsilon )] = \mathcal{O}(\epsilon )\) as \(\epsilon \rightarrow 0{+}\). This last result follows since for \(\tilde{h} \in \{h, h^{2}\}\), we can use the mean-value theorem to find that

$$\begin{aligned} \int _{0}^{T}\big(\tilde{h} (\sigma _{s}^{1, \epsilon } ) - \tilde{h} ( \theta _{1} ) \big)^{2} \, ds =& \int _{0}^{T}\big(\tilde{h}' (\sigma _{s,*}^{1, \epsilon } )\big)^{2} (\sigma _{s}^{1, \epsilon } - \theta _{1} )^{2} \, ds \end{aligned}$$
(5.8)

for some \(\sigma _{s,*}^{1, \epsilon }\) lying between \(\theta _{1}\) and \(\sigma _{s}^{1, \epsilon }\), with

$$\begin{aligned} |\tilde{h}'(\sigma _{s,*}^{1, \epsilon }) | \leq \lambda _{1} + \lambda _{2} |\sigma _{s,*}^{1, \epsilon } |^{m} \leq & \lambda _{1} + \lambda _{2} ( |\sigma _{s}^{1, \epsilon } | + |\theta _{1} | )^{m} \\ \leq & \lambda _{1} + \lambda _{2} ( |\sigma _{s}^{1, \epsilon } - \theta _{1} | + 2 |\theta _{1} | )^{m} \end{aligned}$$

for some \(\lambda _{1}, \lambda _{2} > 0\) and some \(m \in \mathbb{N}\), which allows us to bound the right-hand side of (5.8) by a linear combination of terms of the form \(\Vert \sigma _{\cdot }^{1, \epsilon } - \theta _{1}\Vert _{L^{p}( \Omega \times [0, T])}^{p}\), which are all \(\mathcal{O}(\epsilon )\) as \(\epsilon \rightarrow 0{+}\) by Proposition 3.1. The proof of the theorem is now complete. □

Proof of Corollary 3.3

Let \(\mathcal{E}_{t, \epsilon } = \{\omega \in \Omega : \mathbb{P}[X_{t}^{1, \epsilon } > 0 \, | \, W^{0}, B^{0}, \mathcal{G}] > x\}\) for \(\epsilon > 0\), \(\mathcal{E}_{t, 0} = \{\omega \in \Omega : \mathbb{P}[X_{t}^{1, *} > 0 \, | \, W^{0}, \mathcal{G}] > x\}\) and observe that

$$\begin{aligned} E(x, T) =& \int _{0}^{T} |\mathbb{P} [\mathcal{E}_{t, \epsilon } ] - \mathbb{P} [\mathcal{E}_{t, 0} ] |\,dt \\ =& \int _{0}^{T} |\mathbb{P} [\mathcal{E}_{t, \epsilon } \cap \mathcal{E}_{t, 0}^{c} ] - \mathbb{P} [\mathcal{E}_{t, 0} \cap \mathcal{E}_{t, \epsilon }^{c} ] |\,dt \\ \leq & \int _{0}^{T}\mathbb{P} [\mathcal{E}_{t, \epsilon } \cap \mathcal{E}_{t, 0}^{c} ]\,dt + \int _{0}^{T}\mathbb{P} [\mathcal{E}_{t, 0} \cap \mathcal{E}_{t, \epsilon }^{c} ]\,dt. \end{aligned}$$
(5.9)

Next, for any \(\eta > 0\), we have

$$\begin{aligned} & \mathbb{P} [\mathcal{E}_{t, \epsilon } \cap \mathcal{E}_{t, 0}^{c} ] \\ & = \mathbb{P}\big[\mathbb{P} [X_{t}^{1, \epsilon } > 0 \, | \, W^{0}, B^{0}, \mathcal{G} ] > x \geq \mathbb{P} [X_{t}^{1, *} > 0 \, | \, W^{0}, \mathcal{G} ]\big] \\ & = \mathbb{P}\big[\mathbb{P} [X_{t}^{1, \epsilon } > 0 \, | \, W^{0}, B^{0}, \mathcal{G} ] > x > x - \eta > \mathbb{P} [X_{t}^{1, *} > 0 \, | \, W^{0}, \mathcal{G} ]\big] \\ & \phantom{=:} + \mathbb{P}\big[\mathbb{P} [X_{t}^{1, \epsilon } > 0 \, | \, W^{0}, B^{0}, \mathcal{G} ] > x \geq \mathbb{P} [X_{t}^{1, *} > 0 \, | \, W^{0}, \mathcal{G} ] \geq x - \eta \big] \\ & \leq \mathbb{P}\big[\big|\mathbb{P} [X_{t}^{1, \epsilon } > 0 \, | \, W^{0}, B^{0}, \mathcal{G} ] - \mathbb{P} [X_{t}^{1, *} > 0 \, | \, W^{0}, \mathcal{G} ]\big| > \eta \big] \\ & \phantom{=:} + \mathbb{P}\big[x \geq \mathbb{P} [X_{t}^{1, *} > 0 \, | \, W^{0}, \mathcal{G} ] \geq x - \eta \big] \\ & \leq \frac{1}{\eta ^{2}}\mathbb{E}\big[ (\mathbb{P} [X_{t}^{1, \epsilon } > 0 \, | \, W^{0}, B^{0}, \mathcal{G} ] - \mathbb{P} [X_{t}^{1, *} > 0 \, | \, W^{0}, \mathcal{G} ] )^{2} \big] \\ &\phantom{=:}+ \mathbb{P}\big[x \geq \mathbb{P} [X_{t}^{1, *} > 0 \, | \, W^{0}, \mathcal{G} ] \geq x - \eta \big], \end{aligned}$$
(5.10)

and if \(\mathcal{S}\) denotes the \(\sigma \)-algebra generated by the volatility paths, since \(X_{t}^{1, *}\) is independent of \(\mathcal{S}\) and the path of \(B^{0}\), using the Cauchy–Schwarz inequality gives

$$\begin{aligned} & \int _{0}^{T}\mathbb{E}\big[ (\mathbb{P} [X_{t}^{1, \epsilon } > 0 \, | \, W^{0}, B^{0}, \mathcal{G} ] - \mathbb{P} [X_{t}^{1, *} > 0 \, | \, W^{0}, \mathcal{G} ] )^{2} \big]\,dt \\ & = \int _{0}^{T}\mathbb{E}\Big[\mathbb{E}\big[\mathbb{P} [X_{t}^{1, \epsilon } > 0 \, | \, W^{0}, C_{1}', \mathcal{S}, \mathcal{G} ] \\ & \phantom{=} \qquad \quad - \mathbb{P} [X_{t}^{1, *} > 0 \, | \, W^{0}, C_{1}', \mathcal{G} ] \, \big| \, W^{0}, B^{0}, \mathcal{G} \big]^{2} \Big]\,dt \\ & \leq \int _{0}^{T}\mathbb{E}\Big[\mathbb{E}\big[ (\mathbb{P} [X_{t}^{1, \epsilon } > 0 \, | \, W^{0}, C_{1}', \mathcal{S}, \mathcal{G} ] \\ & \phantom{=}\qquad \quad - \mathbb{P} [X_{t}^{1, *} > 0 \, | \, W^{0}, C_{1}', \mathcal{G} ] )^{2} \, \big| \, W^{0}, B^{0}, \mathcal{G} \big] \Big] \,dt \\ & = \int _{0}^{T}\mathbb{E}\big[ (\mathbb{P} [X_{t}^{1, \epsilon } > 0 \, | \, W^{0}, C_{1}', \mathcal{S}, \mathcal{G} ] - \mathbb{P} [X_{t}^{1, *} > 0 \, | \, W^{0}, C_{1}', \mathcal{G} ] )^{2} \big]\,dt \\ & = \Vert v^{0, \epsilon } (\cdot , 0 ) - v^{0} (\cdot , 0 ) \Vert _{L^{2} (\Omega \times [0, T] )}^{2} \\ & = \mathcal{O} (\epsilon ), \end{aligned}$$

where the last equality above follows by using Morrey’s inequality in dimension 1 (see Evans [10, Sect. 5.6.2, Theorem 4]) and Theorem 3.2. On the other hand, since \(\mathbb{P}[X_{t}^{1, *} > 0 \, | \, W^{0}, \mathcal{G}]\) has a bounded density near \(x\), uniformly in \(t \in [0, T]\), we have

$$\begin{aligned} \int _{0}^{T}\mathbb{P}\big[x \geq \mathbb{P} [X_{t}^{1, *} > 0 \, | \, W^{0}, \mathcal{G} ] \geq x - \eta \big]\,dt = \mathcal{O} (\eta ). \end{aligned}$$

Therefore (5.10) gives \(\int _{0}^{T}\mathbb{P}[\mathcal{E}_{t, \epsilon } \cap \mathcal{E}_{t, 0}^{c}]\,dt \leq \frac{1}{\eta ^{2}}\mathcal{O}(\epsilon ) + \mathcal{O}(\eta ) \) for any \(\eta > 0\), and in a similar way we can obtain \(\int _{0}^{T}\mathbb{P}[\mathcal{E}_{t, 0} \cap \mathcal{E}_{t, \epsilon }^{c}]\,dt \leq \frac{1}{\eta ^{2}}\mathcal{O}(\epsilon ) + \mathcal{O}(\eta )\). Using these two expressions in (5.9) and taking \(\eta = \epsilon ^{p}\) for some \(p > 0\), we finally obtain

$$\begin{aligned} E(x, T) \leq \mathcal{O} (\epsilon ^{p} ) + \mathcal{O} (\epsilon ^{1 - 2p} ), \end{aligned}$$

which becomes optimal as \(\epsilon \rightarrow 0{+}\) when \(1 - 2p = p\), i.e., for \(p = \frac{1}{3}\). This gives \(E(x, T) = \mathcal{O}(\epsilon ^{\frac{1}{3}})\) as \(\epsilon \rightarrow 0{+}\). □
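The choice \(p = \frac{1}{3}\) can be confirmed numerically: among exponents \(p\), the overall decay rate \(\min (p, 1-2p)\) of the bound \(\epsilon ^{p} + \epsilon ^{1-2p}\) is maximized at \(p = \frac{1}{3}\). The \(\epsilon \) values and comparison exponents below are arbitrary.

```python
import math

def decay_rate(p, e1=1e-4, e2=1e-8):
    """Empirical decay exponent (slope on a log-log plot) of the
    bound eps**p + eps**(1 - 2*p) between two small eps values."""
    b1 = e1 ** p + e1 ** (1 - 2 * p)
    b2 = e2 ** p + e2 ** (1 - 2 * p)
    return math.log(b1 / b2) / math.log(e1 / e2)

# The balanced exponent p = 1/3 beats both an undersized and an
# oversized choice of p.
rates = {p: decay_rate(p) for p in (0.2, 1 / 3, 0.45)}
```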