Asymptotics for volatility derivatives in multi-factor rough volatility models

Lacombe, Chloe; Muguruza, Aitor; Stone, Henry

doi:10.1007/s11579-020-00288-5

Open access
Published: 09 January 2021

Volume 15, pages 545–577, (2021)
Mathematics and Financial Economics

Abstract

We study the small-time implied volatility smile for Realised Variance options, and investigate the effect of correlation in multi-factor models on the linearity of the smile. We also develop an approximation scheme for the Realised Variance density, allowing fast and accurate pricing of Volatility Swaps. Additionally, we establish small-noise asymptotic behaviour of a general class of VIX options in the large strike regime.

1 Introduction

Following the works by Alòs et al. [2], Gatheral et al. [24] and Bayer et al. [7], rough volatility is becoming a new breed in financial modelling by generalising Bergomi’s ‘second generation’ stochastic volatility models to a non-Markovian setting. The most basic form of (lognormal) rough volatility model is the so-called rough Bergomi model introduced in [7]. Gassiat [23] recently proved that such a model (under certain correlation regimes) generates true martingales for the spot process. The lack of Markovianity imposes numerous fundamental theoretical questions and practical challenges in order to make rough volatility usable in an industrial environment. On the theoretical side, Jacquier et al. [31] prove a large deviations principle for a rescaled version of the log stock price process. In this same direction, Bayer et al. [8], Forde and Zhang [22], Horvath et al. [26] and most recently Friz et al. [21] (to name a few) prove large deviations principles for a wide range of rough volatility models. On the practical side, competitive simulation methods are developed in Bennedsen et al. [12], Horvath et al. [27] and McCrickerd and Pakkanen [32]. Moreover, recent developments by Stone [35] and Horvath et al. [29] allow the use of neural networks for calibration; their calibration schemes are considerably faster and more accurate than existing methods for rough volatility models.

Crucially, the lack of a pricing PDE imposes a fundamental constraint on the comprehension and interpretation of rough volatility models driven by Volterra-like Gaussian processes. The only current exception in the rough volatility literature is the rough Heston model, developed by El Euch and Rosenbaum [19, 20], which allows a better understanding through the fractional PDE derived in [20]. Nevertheless, in this work our attention is turned to the class of models for which such a pricing PDE is unknown, and hence further theoretical results are required.

Perhaps, options on volatility itself are the most natural object to first analyse within the class of rough volatility models. In this direction, Jacquier et al. [30] provide algorithms for pricing VIX options and futures. Horvath et al. [28] further study VIX smiles in the presence of stochastic volatility of volatility combined with rough volatility. Nevertheless, the precise effect of model parameters (with particular interest in the Hurst parameter effect) on implied volatility smiles for VIX (or volatility derivatives in general) has not been studied until very recently in Alòs et al. [1].

The main focus of the paper is to derive the small-time behaviour of the realised variance process of the rough Bergomi model, as well as related but more complicated multi-factor rough volatility models, together with the small-time behaviour of options on realised variance. These results, which are interesting from a theoretical perspective, have practical applicability to the quantitative finance industry as they allow practitioners to better understand the Realised Variance smile, as well as the effect of correlation on the smile’s linearity (or possibly convexity). To the best of our knowledge, this is the first paper to study the small-time behaviour of options on realised variance. An additional major contribution of the paper is the numerical scheme used to compute the implied volatility smiles, using an accurate approximation of the rate function from a large deviations principle. In general rate functions are highly non-trivial to compute; our method is simple, intuitive, and accurate. The numerical methods are publicly available on GitHub: LDP-VolOptions.

Volatility options are becoming increasingly popular in the financial industry. For instance, the liquidity of VIX options has consistently increased since their creation by the Chicago Board Options Exchange (CBOE). One of the main popularity drivers is that volatility tends to be negatively correlated with the underlying dynamics, making it desirable for portfolio diversification. Due to the appealing nature of volatility options, their modelling has attracted the attention of many academics, such as Carr et al. [14] and Carr and Lee [16], to name a few.

For a log stock price process X defined as $ X_t = - \frac{1}{2} \int _0^t v_s \mathrm {d}s + \int _0^t \sqrt{ v_s } \mathrm {d}B_s, X_0=0, $ where B is standard Brownian motion, we denote the quadratic variation of X at time t by $\langle X \rangle _t$. Then, the core object to analyse in this setting is the realised variance option with payoff

$$\begin{aligned} \left( \frac{1}{T}\int _0^T \mathrm {d}\langle X \rangle _s -K\right) ^+, \end{aligned}$$

(1.1)

which in turn defines the risk neutral density of the realised variance. In this work, we analyse the short time behaviour of the implied volatility given by (1.1) for a number of (rough) stochastic volatility models by means of large deviation techniques. We specifically focus on the construction of correlated factors and their effect on the distribution of the realised variance. We find our results consistent with those of Alòs et al. [1], which also help us characterise in closed form the implied volatility around the money. Moreover, we also obtain some asymptotic results for VIX options.

While implied volatilities for options on equities are typically convex functions of log-moneyness, giving them their “smile” moniker, implied volatility smiles for options on realised variance tend to be linear. Options on integrated variance are OTC products, and so their implied volatility smiles are not publicly available. VIX smiles are, however, and provide a good proxy for integrated variance smiles; see Fig. 1 below for evidence of their linearity. The data also indicates both a power-law term structure ATM and its skew.

Although most of the literature agrees that more than a single factor is needed to model volatility (see Bergomi’s [6] two-factor model, Avellaneda and Papanicolaou [3] or Horvath et al. [28] for instance), there is no in-depth analysis on how to construct these (correlated) factors, nor of the effect of correlation on the price of volatility derivatives and their corresponding implied volatility smiles. Our aim is to understand multi-factor models and analyse the effect of factors in implied volatility smiles. This paper, to the best of our knowledge, is the first to address such questions, which are of great interest to practitioners in the quantitative finance industry; it is also the first to provide a rigorous mathematical analysis of the small-time behaviour of options on integrated variance in rough volatility models.

The structure of the paper is as follows. Section 2 introduces the models, the rough Bergomi model and two closely related processes, whose small-time realised variance behaviour we study; the main results are given in Sect. 3. Section 4 is dedicated to numerical examples of the results attained in Sect. 3. In Sect. 5 we introduce a general variance process, which includes the rough Bergomi model for a specific choice of kernel, and briefly investigate the small-noise behaviour of VIX options in this general setting. Motivated by the numerical examples in Sect. 4, we propose a simple and very feasible approximation for the density of the realised variance for the mixed rough Bergomi model [see (2.5)] in “Appendix A”. The proofs of the main results are given in “Appendix B”; the details of the numerics are given in “Appendix C”.

Notations Let $\mathbb {R}_+ := [0,+\infty )$ and $\mathbb {R}^*_+ := (0,+\infty )$. For some index set $\mathcal {T}\subseteq \mathbb {R}_+$, the notation $L^2(\mathcal {T})$ denotes the space of real-valued square integrable functions on $\mathcal {T}$, and $ \mathcal {C}(\mathcal {T}, \mathbb {R}^d)$ the space of $\mathbb {R}^d$-valued continuous functions on $\mathcal {T}$. ${\mathcal {E}}$ denotes the Wick stochastic exponential.

2 A showcase of rough volatility models

In this section we introduce the models that will be considered in the forthcoming computations. We shall always work on a given filtered probability space $(\Omega ,(\mathscr {F}_t)_{t\ge 0},{\mathbb {P}})$. For notational convenience, we introduce

$$\begin{aligned} Z_t := \int _0^t K_\alpha (s,t)\mathrm {d}W_s, \qquad \text {for any }t \in \mathcal {T}, \end{aligned}$$

(2.1)

where $\alpha \in \left( -\frac{1}{2},0\right) $, W is a standard Brownian motion, and the kernel $K_{\alpha }:\mathbb {R}_+\times \mathbb {R}_+ \rightarrow \mathbb {R}_+$ reads

$$\begin{aligned} K_{\alpha }(s,t) := \eta \sqrt{2\alpha + 1}(t-s)^{\alpha }, \qquad \text {for all } 0\le s<t, \end{aligned}$$

(2.2)

for some strictly positive constant $\eta $. Note that, for any $t\ge 0$, the map $s\mapsto K_\alpha (s,t)$ belongs to $L^2(\mathcal {T})$, so that the stochastic integral (2.1) is well defined. We also define an analogous multi-dimensional version of (2.1) by
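
As a minimal numerical sketch of (2.1) and (2.2), the Python snippet below evaluates the kernel and draws $Z_t$ at a single time $t$. The function names `kernel`, `cell_weights` and `sample_Z`, and the idea of splitting $[0,t]$ into cells whose exact integrals of $K_\alpha ^2$ are used as variances, are illustrative assumptions and are not taken from the paper. The squared weights sum to $\eta ^2 t^{2\alpha +1}$, which is the variance of $Z_t$ obtained by integrating $K_\alpha ^2$.

```python
import math
import random


def kernel(s, t, alpha, eta):
    """K_alpha(s, t) = eta * sqrt(2*alpha + 1) * (t - s)**alpha for 0 <= s < t."""
    if not 0.0 <= s < t:
        raise ValueError("kernel is only defined for 0 <= s < t")
    return eta * math.sqrt(2.0 * alpha + 1.0) * (t - s) ** alpha


def cell_weights(t, n_steps, alpha, eta):
    """Weights w_i with w_i**2 equal to the integral of K_alpha(s, t)**2 over cell i of [0, t]."""
    beta = 2.0 * alpha + 1.0
    ds = t / n_steps
    weights = []
    for i in range(n_steps):
        left = i * ds
        right = (i + 1) * ds
        # Exact cell integral of eta^2 * beta * (t - s)^(2 alpha);
        # max(...) guards against a tiny negative base caused by rounding.
        w2 = eta ** 2 * ((t - left) ** beta - max(0.0, t - right) ** beta)
        weights.append(math.sqrt(w2))
    return weights


def sample_Z(t, n_steps, alpha, eta, rng):
    """One draw of Z_t as a weighted sum of independent standard normals (illustrative)."""
    return sum(w * rng.gauss(0.0, 1.0) for w in cell_weights(t, n_steps, alpha, eta))


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_Z(0.5, 100, alpha=-0.4, eta=1.5, rng=rng))
```

This only matches the marginal law of $Z_t$ at one time point; simulating whole paths needs a joint scheme such as those cited in the Introduction.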

$$\begin{aligned} {\mathcal {Z}}_t:=\left( \int _0^t K_\alpha (s,t)\mathrm {d}W^1_s, \ldots , \int _0^t K_\alpha (s,t)\mathrm {d}W^m_s \right) := \left( {\mathcal {Z}}^1_t, \ldots , {\mathcal {Z}}_t^m \right) , \qquad \text {for any }t \in \mathcal {T}, \end{aligned}$$

(2.3)

where $W^1,\ldots ,W^m$ are independent Brownian motions.

Model 2.1

(Rough Bergomi) The rough Bergomi model, where X is the log stock price process and v is the instantaneous variance process, is then defined (see [7]) as

$$\begin{aligned} \begin{array}{rll} X_t &{} = \displaystyle - \frac{1}{2} \int _0^t v_s \mathrm {d}s + \int _0^t \sqrt{ v_s } \mathrm {d}B_s, \quad &{} X_0 = 0 , \\ v_t &{}= \displaystyle v_0 \exp \left( Z_t -\frac{\eta ^2}{2}t^{2 \alpha +1} \right) , \quad &{} v_0 > 0, \end{array} \end{aligned}$$

(2.4)

where the Brownian motion B is defined as $B := \rho W + \sqrt{1-\rho ^2}W^\perp $ for $\rho \in [-1,1]$ and some standard Brownian motion $W^\perp $ independent of W.

Remark 2.2

The process $\log v$ has a modification whose sample paths are almost surely locally $\gamma $-Hölder continuous, for all $\gamma \in \left( 0, \alpha + \frac{1}{2} \right) $ [31, Proposition 2.2]. For rough volatility models involving a fractional Brownian motion, the sample path regularity of the log volatility process is referred to in terms of the Hurst parameter H; recall that the fractional Brownian motion has sample paths that are $\gamma $-Hölder continuous for any $\gamma \in (0,H)$ [10, Theorem 1.6.1]. By identification, therefore, we have that $\alpha = H - 1/2$.

Model 2.3

(Mixed rough Bergomi) The mixed rough Bergomi model is given in terms of log stock price process X and instantaneous variance process $v^{(\gamma ,\nu )}$ as

$$\begin{aligned} \begin{array}{rll} X_t &{} = \displaystyle - \frac{1}{2} \int _0^t v_s^{(\gamma ,\nu )} \mathrm {d}s + \int _0^t \sqrt{ v_s^{(\gamma ,\nu )} } \mathrm {d}B_s, \quad &{} X_0 = 0 , \\ v_t^{(\gamma ,\nu )} &{}= v_0 \sum _{i=1}^n \gamma _i \exp \left( \frac{\nu _i}{\eta }Z_t - \frac{\nu _i^2}{2}t^{2\alpha +1}\right) ,\quad &{} v_0>0 \end{array} \end{aligned}$$

(2.5)

where $\gamma :=(\gamma _1,\ldots ,\gamma _n)\in [0,1]^n$ such that $\sum _{i=1}^n \gamma _i =1$ and $\nu :=(\nu _1,\ldots ,\nu _n)\in {\mathbb {R}}^n$, such that $ 0<\nu _1<\ldots <\nu _n$.

The above modification of the rough Bergomi model, inspired by Bergomi [5], allows a bigger slope (hence bigger skew) on the implied volatility of variance/volatility options to be created, whilst maintaining a tractable instantaneous variance form. This will be made precise in Sect. 4.2.
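
To make the relation between (2.4) and (2.5) concrete, the following Python sketch evaluates both variance formulas for a given realisation `z` of $Z_t$. The function names and argument order are hypothetical choices for illustration. With a single component, $\gamma _1=1$ and $\nu _1=\eta $, the mixed formula reduces to the rough Bergomi one.

```python
import math


def rough_bergomi_variance(z, t, v0, alpha, eta):
    """v_t of (2.4) given a realisation z of Z_t."""
    beta = 2.0 * alpha + 1.0
    return v0 * math.exp(z - 0.5 * eta ** 2 * t ** beta)


def mixed_variance(z, t, v0, gammas, nus, alpha, eta):
    """v_t^(gamma, nu) of (2.5) given a realisation z of Z_t."""
    if abs(sum(gammas) - 1.0) > 1e-12:
        raise ValueError("the weights gamma_i must sum to one")
    beta = 2.0 * alpha + 1.0
    return v0 * sum(
        g * math.exp(nu / eta * z - 0.5 * nu ** 2 * t ** beta)
        for g, nu in zip(gammas, nus)
    )


if __name__ == "__main__":
    # Illustrative parameter values only.
    print(mixed_variance(0.3, 0.5, 0.04, [0.4, 0.6], [0.8, 2.0], alpha=-0.4, eta=1.5))
```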

Model 2.4

(Mixed multi-factor rough Bergomi) The mixed multi-factor rough Bergomi model is given in terms of log stock price process X and instantaneous variance process $v^{(\gamma ,\nu ,\Sigma )}$ as

$$\begin{aligned} \begin{array}{rll} X_t &{} = \displaystyle - \frac{1}{2} \int _0^t v_s^{(\gamma ,\nu ,\Sigma )} \mathrm {d}s + \int _0^t \sqrt{ v_s^{(\gamma ,\nu ,\Sigma )} } \mathrm {d}B_s, \quad &{} X_0 = 0 , \\ v_t^{(\gamma ,\nu ,\Sigma )}&{}= v_0 \sum _{i=1}^{n} \gamma _i {\mathcal {E}}\left( \frac{\nu ^i}{\eta }\cdot \mathrm {L}_i {\mathcal {Z}}_t\right) ,\quad &{} v_0>0, \end{array} \end{aligned}$$

(2.6)

where $\gamma :=(\gamma _1,\ldots ,\gamma _n)\in [0,1]^n$ such that $\sum _{i=1}^n \gamma _i =1$. The vector $\nu ^i=(\nu ^i_1,\ldots ,\nu ^i_m)\in {\mathbb {R}}^m$ satisfies $0<\nu ^i_1<\cdots <\nu ^i_m$ for all $i\in \{1,\ldots ,n\}$. In addition, $\mathrm {L}_i\in {\mathbb {R}}^{m\times m}$ is a lower triangular matrix such that $\mathrm {L}_i \mathrm {L}^T_i =: \Sigma _i$ is a positive definite matrix for all $i\in \{1,\ldots ,n\}$, denoting the covariance matrix.
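
The ingredients of (2.6) can be sketched in Python as follows. The snippet computes a lower triangular factor $\mathrm {L}$ with $\mathrm {L}\mathrm {L}^T=\Sigma $ by a plain Cholesky recursion and evaluates one summand of (2.6) for a realisation `z` of ${\mathcal {Z}}_t$. It assumes the usual meaning of the Wick exponential for a centred Gaussian variable $X$, namely ${\mathcal {E}}(X)=\exp (X-\frac{1}{2}{\mathbb {E}}[X^2])$; this reading, like the function names, is an assumption and not a statement of the paper's implementation.

```python
import math


def cholesky_lower(sigma):
    """Lower-triangular L with L L^T = sigma, for a symmetric positive definite matrix."""
    m = len(sigma)
    L = [[0.0] * m for _ in range(m)]
    for r in range(m):
        for c in range(r + 1):
            acc = sigma[r][c] - sum(L[r][k] * L[c][k] for k in range(c))
            if r == c:
                if acc <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[r][c] = math.sqrt(acc)
            else:
                L[r][c] = acc / L[c][c]
    return L


def wick_exponential_factor(nu, L, z, eta, t, alpha):
    """exp(X - Var(X)/2) for X = (nu / eta) . (L z), assuming E(X) = exp(X - E[X^2]/2)."""
    m = len(nu)
    beta = 2.0 * alpha + 1.0
    Lz = [sum(L[r][k] * z[k] for k in range(m)) for r in range(m)]
    x = sum(nu[r] * Lz[r] for r in range(m)) / eta
    # Var(X) = nu^T (L L^T) nu * t^beta, because the components of Z_t are
    # independent with variance eta^2 * t^beta each.
    LT_nu = [sum(L[r][k] * nu[r] for r in range(m)) for k in range(m)]
    var_x = sum(v * v for v in LT_nu) * t ** beta
    return math.exp(x - 0.5 * var_x)


if __name__ == "__main__":
    L = cholesky_lower([[1.0, 0.5], [0.5, 1.0]])
    print(L, wick_exponential_factor([1.0, 2.0], L, [0.1, -0.2], 1.5, 0.5, -0.4))
```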

For all results involving models (2.4), (2.5), and (2.6) we fix $\mathcal {T}=[0,1]$; minor adjustments to the proofs yield analogous results for more general $\mathcal {T}$. We additionally define $\beta :=2\alpha +1\in (0,1)$ for notational convenience.

Remark 2.5

In models (2.4)–(2.6) we have considered a flat or constant initial forward variance curve $v_0>0$. However, our framework can be easily extended to functional forms $v_0(\cdot ): \mathcal {T}\mapsto \mathbb {R}_+$ via the Contraction Principle, see “Appendix D”, as long as the mapping is continuous.

Remark 2.6

The reader may already have realised that the mixed multi-factor rough Bergomi defined in (2.6) is indeed general enough to cover both (2.4) and (2.5). However, we provide our theoretical results in an orderly fashion starting from (2.4) and finishing with (2.6), which we find the most natural way to increase the complexity of the model.

In place of $K_\alpha $ in (2.1), one may also consider more general kernels of the form

$$\begin{aligned} G(t,s):=K_{\alpha }(t,s) L(t-s) \end{aligned}$$

where $L\in {\mathcal {C}}(0,\infty )$ is a measurable function such that the stochastic integral is well defined, $L(\cdot ) $ is decreasing and $L(\cdot )$ is continuous at 0. We note that under such conditions, $L(\cdot )$ is also slowly varying at zero, i.e. $\lim _{x\rightarrow 0} \frac{L(tx)}{L(x)}=1$ for any $t>0$. Such kernels are naturally related to the class of Truncated Brownian Semistationary ($\mathcal {TBSS}$) processes introduced by Barndorff-Nielsen and Schmiegel [13]. Examples include the Gamma and Power-law kernels:

$$\begin{aligned} L_{\text {Gamma}}(t-s)&=\exp (-\kappa (t-s)),&\kappa >0\\ L_{\text {Power}}(t-s)&=(1+t-s)^{\zeta -\alpha },&\zeta <-1. \end{aligned}$$

The following result gives the exponential equivalence between the sequences of the rescaled stochastic integrals of $K_\alpha $ and G, thus it is completely justified to only consider the case $K_{\alpha }$, without any loss of generality.

Proposition 2.7

The sequences of processes $\left( \varepsilon ^{\beta / 2 }\int _0^\cdot K_\alpha (\cdot ,s) L(\varepsilon (\cdot -s))\mathrm {d}W_s\right) _{\varepsilon >0}$ and $\left( \varepsilon ^{\beta / 2}Z_\cdot \right) _{\varepsilon >0}$ are exponentially equivalent.

Proof

As L is a slowly varying function, the so-called Potter bounds [9, Theorem 1.5.6, page 25] hold on the interval (0, 1]: indeed, for all $\xi >0$, there exist constants $0< {\underline{C}}^\xi \le {\overline{C}}^\xi $ such that

$$\begin{aligned} {\underline{C}}^\xi (\varepsilon x)^\xi< L(\varepsilon x) < {\overline{C}}^\xi (\varepsilon x)^\xi , \quad \text {for all } (\varepsilon x)\in (0,1]. \end{aligned}$$

In particular, for $\varepsilon >0$ such that $(\varepsilon x)\in [0,1]$, $L(\varepsilon x) -1 \le K_\xi (\varepsilon x)^\xi $ where $K_\xi < \infty $ as $\xi >0$. Thus, for all $\delta >0$,

$$\begin{aligned} \begin{aligned}\displaystyle&\mathbb {P}\left( \left\| \varepsilon ^{\beta / 2} \int _0^\cdot K_\alpha (\cdot , s)\left[ L(\varepsilon (\cdot -s))-1 \right] \mathrm {d}W_s \right\| _{\infty } \!\!\!\!> \delta \right) \\&\quad = \mathbb {P}\left( |{\mathcal {N}}(0,1)| > \frac{\delta }{\varepsilon ^{\beta / 2} \left\| V_\varepsilon \right\| _{\infty }} \right) \\&\quad = \sqrt{\frac{2}{\pi }} \frac{\varepsilon ^{\beta / 2} \left\| V_\varepsilon \right\| _{\infty }}{\delta } \exp \left( \frac{-\delta ^2}{2\varepsilon ^{\beta } \left\| V_\varepsilon \right\| ^2_{\infty }}\right) (1+{\mathcal {O}}(\varepsilon ^{\beta })), \end{aligned} \end{aligned}$$

where the final equality follows by using the asymptotic expansion of the Gaussian density near infinity [4, Formula (26.2.12)], and $V^2_\varepsilon (t) := \int _0^t K^2_\alpha (t,s) \left[ L(\varepsilon (t-s)) - 1\right] ^2 \mathrm {d}s$. Then,

$$\begin{aligned} V^2_\varepsilon (t) \le K^2_\xi \varepsilon ^{2\xi } \eta ^2 (2\alpha +1) \int _0^t (t-s)^{2(\alpha + \xi )} \mathrm {d}s = \left( K_\xi \eta \right) ^2 \frac{2\alpha +1}{2(\alpha + \xi )+1} \varepsilon ^{2\xi } t^{2(\alpha + \xi ) +1}, \end{aligned}$$

so that $\lim _{\varepsilon \downarrow 0} \left\| V^2_\varepsilon \right\| _\infty =0$, as well as $\lim _{\varepsilon \downarrow 0} \left\| V_\varepsilon \right\| _\infty =0$. Therefore

$$\begin{aligned}&\varepsilon ^\beta \log \mathbb {P}\left( \left\| \varepsilon ^{\beta / 2} \int _0^\cdot K_\alpha (\cdot , s) \left[ L(\varepsilon (\cdot -s)) -1 \right] \mathrm {d}W_s \right\| _\infty > \delta \right) \\&\quad \le \frac{\varepsilon ^\beta }{2} \log \left( \frac{2}{\pi }\right) + \varepsilon ^\beta \log \left( \frac{\varepsilon ^{\beta / 2} \left\| V_\varepsilon \right\| _\infty }{\delta }\right) \\&\qquad - \frac{\delta ^2}{2\left\| V_\varepsilon \right\| ^2_\infty } + \varepsilon ^\beta \log \left( 1+{\mathcal {O}}\left( \varepsilon ^\beta \right) \right) . \end{aligned}$$

As $\varepsilon $ tends to zero, the first, second and fourth terms in the above inequality tend to zero (recall that $0<\beta <1$), and the third term tends to $-\infty $. Hence for all $\delta >0$,

$$\begin{aligned} \limsup _{\varepsilon \downarrow 0} \varepsilon ^\beta \log \mathbb {P}\left( \left\| \varepsilon ^{\beta / 2} \int _0^\cdot K_\alpha (\cdot , s) \left[ L(\varepsilon (\cdot -s)) -1 \right] \mathrm {d}W_s \right\| _\infty > \delta \right) =- \infty , \end{aligned}$$

and thus the two processes are exponentially equivalent [18, Definition 4.2.10]; see “Appendix D”. $\square $

Corollary 2.8

The sequences of processes $(\varepsilon ^{\beta /2} \int _0^t \mathrm {e}^{-\kappa _j \varepsilon (t-s)} K_\alpha (s,t)\mathrm {d}W^j_s)_{\varepsilon >0}$ and $(\varepsilon ^{\beta /2} \int _0^t K_\alpha (s,t)\mathrm {d}W^j_s)_{\varepsilon >0}$ are exponentially equivalent for $j=1,\ldots ,m$, where each $\kappa _j>0$.

Proof

The proof is similar to the proof of Proposition 2.7. The variance in the asymptotic expansion of the Gaussian density near infinity [4, Formula (26.2.12)] is defined as

$$\begin{aligned} V^2_\varepsilon (t):= & {} \int _0^t \left[ \mathrm {e}^{-\kappa _j \varepsilon (t -s)} -1 \right] ^2 K^2_\alpha (s,t) \mathrm {d}s \\= & {} \eta ^2 \beta \left[ \frac{t^\beta }{\beta } -2\int _0^t \mathrm {e}^{-\kappa _j \varepsilon (t-s)} (t-s)^{2\alpha } \mathrm {d}s + \int _0^t \mathrm {e}^{-2\kappa _j \varepsilon (t-s)} (t-s)^{2\alpha } \mathrm {d}s\right] , \end{aligned}$$

such that

$$\begin{aligned} 0 <\varepsilon ^\beta V^2_\varepsilon \le \eta ^2 \beta \varepsilon ^\beta \left[ \frac{t^\beta }{\beta } + \int _0^t \mathrm {e}^{-2\kappa _j \varepsilon (t-s)} (t-s)^{2\alpha } \mathrm {d}s \right] \le 2 \eta ^2 (\varepsilon t)^\beta , \end{aligned}$$

and therefore $\lim _{\varepsilon \downarrow 0} V^2_\varepsilon = 0$ and $\lim _{\varepsilon \downarrow 0} \varepsilon ^\beta \left\| V_\varepsilon \right\| ^2_\infty = 0$. Then, for all $\delta >0$,

$$\begin{aligned} \limsup _{\varepsilon \downarrow 0} \varepsilon ^\beta \log {\mathbb {P}} \left( \left\| \varepsilon ^{\beta /2} \int _0^{\cdot } K_{\alpha }(s,\cdot ) \left[ \exp (-\kappa _j \varepsilon (\cdot -s))-1\right] \mathrm {d}W^j_s\right\| _{\infty } > \delta \right) = -\infty , \end{aligned}$$

and thus the two processes are exponentially equivalent [18, Definition 4.2.10]. $\square $
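
The decay of the variance term for the Gamma kernel can be checked numerically. The Python sketch below evaluates $\int _0^t [\mathrm {e}^{-\kappa \varepsilon (t-s)}-1]^2 K^2_\alpha (s,t)\mathrm {d}s$ with a midpoint rule in the variable $u=t-s$. The function name `v_eps_squared`, the quadrature and the parameter values are illustrative assumptions. Near $u=0$ the integrand behaves like $u^{2\alpha +2}$, so the singularity of the kernel is cancelled and a simple rule is adequate.

```python
import math


def v_eps_squared(t, eps, kappa, alpha, eta, n_steps=5000):
    """Midpoint-rule value of int_0^t (exp(-kappa*eps*(t-s)) - 1)^2 K_alpha(s,t)^2 ds."""
    beta = 2.0 * alpha + 1.0
    du = t / n_steps
    total = 0.0
    for i in range(n_steps):
        u = (i + 0.5) * du  # u = t - s at the midpoint of cell i
        total += (math.exp(-kappa * eps * u) - 1.0) ** 2 * u ** (2.0 * alpha)
    return eta ** 2 * beta * total * du


if __name__ == "__main__":
    for eps in (1.0, 0.1, 0.01):
        # The value shrinks roughly like eps^2 as eps decreases.
        print(eps, v_eps_squared(1.0, eps, kappa=1.0, alpha=-0.4, eta=1.5))
```

For small $\varepsilon $ the value is close to $\eta ^2\beta (\kappa \varepsilon )^2 t^{\beta +2}/(\beta +2)$, obtained by replacing $\mathrm {e}^{-x}-1$ with $-x$, so $\delta ^2/(2\Vert V_\varepsilon \Vert ^2_\infty )$ grows without bound as $\varepsilon $ tends to zero.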

3 Small-time results for options on integrated variance

We start our theoretical analysis by considering options on realised variance, which we also refer to as integrated variance and RV interchangeably. We recall that volatility is not directly observable, nor a tradeable asset. Options on realised variance, however, exist and are traded as OTC products. Below are two examples of the payoff structure of such products:

$$\begin{aligned} (i) (RV(v)(T)-K)^+, \quad (ii) (\sqrt{RV(v)(T)}-K)^+, \quad \text {where } T,K \ge 0. \end{aligned}$$

(3.1)

Here we define the following operator on $\mathcal {C}(\mathcal {T})$:

$$\begin{aligned} RV(f)(\cdot ):f \mapsto \frac{1}{\cdot } \int _0^\cdot f(s) \mathrm {d}s, \quad RV(f)(0):=f(0), \end{aligned}$$

(3.2)

and v represents the instantaneous variance in a given stochastic volatility model. Note that $RV(v)(0)~=~v_0$.
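
A minimal Python sketch of the operator (3.2) and the two payoffs in (3.1) is given below. The midpoint quadrature and the function names are illustrative assumptions; `f` stands for any continuous path of the instantaneous variance.

```python
import math


def realised_variance(f, t, n_steps=1000):
    """RV(f)(t) = (1/t) * int_0^t f(s) ds by the midpoint rule, with RV(f)(0) := f(0)."""
    if t < 0.0:
        raise ValueError("t must be non-negative")
    if t == 0.0:
        return f(0.0)
    ds = t / n_steps
    return sum(f((i + 0.5) * ds) for i in range(n_steps)) * ds / t


def variance_call_payoff(rv, strike):
    """Payoff (i) in (3.1): (RV - K)^+."""
    return max(rv - strike, 0.0)


def volatility_call_payoff(rv, strike):
    """Payoff (ii) in (3.1): (sqrt(RV) - K)^+."""
    return max(math.sqrt(rv) - strike, 0.0)


if __name__ == "__main__":
    rv = realised_variance(lambda s: 0.04 * math.exp(-s), 0.5)
    print(rv, variance_call_payoff(rv, 0.03), volatility_call_payoff(rv, 0.15))
```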

Remark 3.1

As shown by Neuberger [33], we may rewrite the variance swap in terms of the log contract as

$$\begin{aligned} {\mathbb {E}}[RV(v)(T)]={\mathbb {E}}\left[ \frac{1}{T}\int _0^T v_s \mathrm {d}s\right] ={\mathbb {E}}\left[ -2\frac{X_T}{T}\right] \end{aligned}$$

(3.3)

where ${\mathbb {E}}[\cdot ]$ is taken under the risk-neutral measure and $S=\exp (X)$ is a risk-neutral martingale (assuming interest rates and dividends to be null). Therefore, the risk neutral pricing of RV(v)(T) or options on it is fully justified by (3.3).

3.1 Small-time results for the rough Bergomi model

Proposition 3.2

The set ${\mathscr {H}}^{K_{\alpha }}:=\{\int _0^{\cdot } K_{\alpha }(s,\cdot )f(s) \mathrm {d}s : f\in L^2(\mathcal {T}) \} $ defines the reproducing kernel Hilbert space for Z with inner product $\langle \int _0^{\cdot } K_{\alpha }(s,\cdot )f_1(s)\mathrm {d}s, \int _0^{\cdot } K_{\alpha }(s,\cdot )f_2(s)\mathrm {d}s \rangle _{{\mathscr {H}}^{K_{\alpha }}} = \langle f_1, f_2 \rangle _{L^2(\mathcal {T})} $.

Proof

See [31, Theorem 3.1]. $\square $

Before stating Theorem 3.3, we define the following function $\Lambda ^{Z}: \mathcal {C}(\mathcal {T}) \rightarrow \mathbb {R}_+$ as $ \Lambda ^{Z}(\mathrm {x}):= \frac{1}{2}\Vert \mathrm {x}\Vert _{{\mathscr {H}}^{K_\alpha }}^2 $, and if $\mathrm {x}\notin {\mathscr {H}}^{K_{\alpha }}$ then $\Lambda ^{Z}(\mathrm {x}) = + \infty $.

Theorem 3.3

The variance process $(v_\varepsilon )_{\varepsilon >0}$ satisfies a large deviations principle on $\mathcal {C}(\mathcal {T})$ as $\varepsilon $ tends to zero, with speed $\varepsilon ^{-\beta }$ and rate function $\Lambda ^v(x):= \Lambda ^Z\left( \log \left( \frac{x}{v_0} \right) \right) $, where $\Lambda ^v(v_0)=0$ and $x\in {\mathcal {C}}({\mathcal {T}})$.

Proof

To ease the flow of the paper the proof is postponed to “Appendix B.1”. $\square $

Corollary 3.4

The integrated variance process $\left( RV(v)(t)\right) _{t \in \mathcal {T}}$ satisfies a large deviations principle on $\mathbb {R}^*_+$ as t tends to zero, with speed $t^{-\beta }$ and rate function ${\hat{\Lambda }}^v$ defined as ${\hat{\Lambda }}^v(\mathrm {y}) := \inf \left\{ \Lambda ^v(\mathrm {x}) : \mathrm {y}= RV(\mathrm {x})(1) \right\} $, where ${\hat{\Lambda }}^v(v_0)=0$.

Proof

As proved in Theorem 3.3, the process $(v_\varepsilon )_{\varepsilon >0}$ satisfies a pathwise large deviations principle on $\mathcal {C}(\mathcal {T})$ as $\varepsilon $ tends to zero. For small perturbations $\delta ^v \in \mathcal {C}(\mathcal {T})$, we have

$$\begin{aligned} \left\| RV(v+\delta ^v)(t)-RV(v)(t)\right\| _\infty \le \sup _{t\in \mathcal {T}} \frac{1}{t} \left| \int _0^t \delta ^v(s) \mathrm {d}s \right| \le M, \end{aligned}$$

where $M= \sup _{t\in \mathcal {T}} |\delta ^v(t)|$, which is finite as $\delta ^v \in \mathcal {C}(\mathcal {T})$. Clearly M tends to zero as $ \delta ^v$ tends to zero, and hence the operator RV is continuous with respect to the sup norm on $\mathcal {C}(\mathcal {T})$. Therefore we can apply the Contraction Principle (see “Appendix D”), and consequently the integrated variance process $RV(v_\varepsilon )$ satisfies a large deviations principle on $\mathcal {C}(\mathcal {T})$ as $\varepsilon $ tends to zero. Clearly $RV(v_\varepsilon )(t)=RV(v)(\varepsilon t)$, for all $t\in \mathcal {T}$, and so setting $t=1$ and mapping $\varepsilon $ to t then yields the result. By definition, ${\hat{\Lambda }}^v(\mathrm {y}) := \inf \left\{ \Lambda ^v(\mathrm {x}) : \mathrm {y}= RV(\mathrm {x})(1) \right\} $. If we choose $\mathrm {x}\equiv v_0$ then clearly $v_0=RV(\mathrm {x})(1)$, and $\Lambda ^v(\mathrm {x})=0$. Since $\Lambda ^v$ is non-negative, being either half of a squared norm or $+\infty $, the infimum ${\hat{\Lambda }}^v$ is non-negative as well, and therefore ${\hat{\Lambda }}^v(v_0)=0$. This concludes the proof. $\square $
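
The infimum defining ${\hat{\Lambda }}^v$ can be bounded from above by restricting it to a small family of paths. The Python sketch below uses constant controls $f\equiv c$ in Proposition 3.2 with $\mathcal {T}=[0,1]$, for which $\int _0^t K_\alpha (s,t)c\,\mathrm {d}s = c\,\eta \sqrt{2\alpha +1}\,t^{\alpha +1}/(\alpha +1)$ and $\Lambda ^Z = c^2/2$, and solves $RV(\mathrm {x})(1)=\mathrm {y}$ for $c$ by bisection. This restriction, the function names and the bracket `c_max` are assumptions made for illustration; the result is only an upper bound on ${\hat{\Lambda }}^v(\mathrm {y})$ and is not the approximation scheme used in the paper.

```python
import math


def rv_of_constant_control(c, v0, alpha, eta, n_steps=2000):
    """RV(x)(1) for x(t) = v0 * exp(phi_c(t)), phi_c(t) = int_0^t K_alpha(s,t) * c ds."""
    beta = 2.0 * alpha + 1.0
    scale = c * eta * math.sqrt(beta) / (alpha + 1.0)
    ds = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        s = (i + 0.5) * ds
        total += math.exp(scale * s ** (alpha + 1.0))
    return v0 * total * ds


def rate_upper_bound(y, v0, alpha, eta, c_max=50.0, tol=1e-10):
    """Upper bound c(y)^2 / 2 on hat-Lambda^v(y), using constant controls only."""
    if y <= 0.0:
        raise ValueError("y must be strictly positive")
    lo, hi = -c_max, c_max
    if not (rv_of_constant_control(lo, v0, alpha, eta) <= y
            <= rv_of_constant_control(hi, v0, alpha, eta)):
        raise ValueError("y is outside the range reachable with |c| <= c_max")
    # RV is increasing in c, so bisection on c is valid.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rv_of_constant_control(mid, v0, alpha, eta) < y:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    return 0.5 * c * c


if __name__ == "__main__":
    for y in (0.03, 0.04, 0.06, 0.08):
        print(y, rate_upper_bound(y, v0=0.04, alpha=-0.4, eta=1.5))
```

The bound vanishes at $\mathrm {y}=v_0$, in agreement with ${\hat{\Lambda }}^v(v_0)=0$, and increases as $\mathrm {y}$ moves away from $v_0$ in either direction.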

Remark 3.5

Corollary 3.4 can be applied to a large number of existing results on large deviations for (rough) variance processes to get a large deviations result for the integrated (rough) variance process; for example Forde and Zhang [22] and Horvath et al. [26].

Corollary 3.6

The rate function ${\hat{\Lambda }}^v$ is continuous.

Proof

Indeed, as a rate function, ${\hat{\Lambda }}^v$ is lower semi-continuous. Moreover, as $\Lambda ^v$ is continuous, one can use similar arguments to [22, Corollary 4.6], and deduce that ${\hat{\Lambda }}^v$ is upper semi-continuous, and hence is continuous. $\square $

Before stating results on the small-time behaviour of options on integrated variance, we note that the log integrated variance process $\log RV(v)$ satisfies a large deviations principle on $\mathbb {R}$ as t tends to zero, with speed $t^{-\beta }$ and rate function ${\hat{\Lambda }}^v(\mathrm {e}^\cdot )$. Then, the small-time behaviour of such options can be obtained as an application of Corollary 3.4.

Corollary 3.7

For log moneyness $k:=\log \frac{K}{RV(v)(0)} \ne 0$, the following equality holds true for Call options on integrated variance

$$\begin{aligned} \lim _{t \downarrow 0} t^\beta \log {\mathbb {E}} \left[ \left( RV(v)(t)-\mathrm {e}^k\right) ^+\right] = - \mathrm {I}(k), \end{aligned}$$

(3.4)

where $\mathrm {I}$ is defined as $\mathrm {I}(x) := \inf _{y>x} {\hat{\Lambda }}^v(\mathrm {e}^y)$ for $x >0$, $\mathrm {I}(x) := \inf _{y<x} {\hat{\Lambda }}^v(\mathrm {e}^y)$ for $x <0$.

Similarly, for log moneyness $k:=\log \frac{K}{\sqrt{RV(v)(0)}} \ne 0$,

$$\begin{aligned} \lim _{t \downarrow 0} t^\beta \log {\mathbb {E}} \left[ \left( \sqrt{RV(v)(t)}-\mathrm {e}^k\right) ^+\right] = -\bar{\mathrm {I}}(k), \end{aligned}$$

(3.5)

where $\bar{\mathrm {I}}$ is defined analogously as $\bar{\mathrm {I}}(x) := \inf _{y>x} {\hat{\Lambda }}^v(\mathrm {e}^{2y})$ for $x >0$ and $\bar{\mathrm {I}}(x) := \inf _{y<x} {\hat{\Lambda }}^v(\mathrm {e}^{2y})$ for $x<0$.

Proof

The proof is postponed to “Appendix B.2”. $\square $

As with Call options on stock price processes, we can define and study the implied volatility of such options. In the case of (3.1)(i) we define the implied volatility ${\hat{\sigma }}(T,k)$ to be the solution to

$$\begin{aligned} \mathbb {E}[(RV(v)(T)-e^k)^+ ]= C_{BS}( RV(v)(0), k, T, {\hat{\sigma }}(T,k) ), \end{aligned}$$

(3.6)

where $C_{BS}$ denotes the Call price in the Black-Scholes model. Using Corollary 3.7, we deduce the small-time behaviour of the implied volatility ${\hat{\sigma }}$, as defined in (3.6).

Corollary 3.8

The small-time asymptotic behaviour of the implied volatility is given by the following limit, for a log moneyness $k \ne 0$:

$$\begin{aligned} \lim _{t \downarrow 0} t^{1-\beta } {\hat{\sigma }}^2(t,k)=:{\hat{\sigma }}^2(k) = \frac{k^2}{2\mathrm {I}(k)}. \end{aligned}$$

Proof

The log integrated variance process $\log RV(v)$ satisfies a large deviations principle with speed $t^{-\beta }$ and rate function ${\hat{\Lambda }}^v(\mathrm {e}^\cdot )$, which is continuous. Therefore, it follows that

$$\begin{aligned} \lim _{t\downarrow 0} t^\beta \log \mathbb {P}(RV(v)(t) \ge \mathrm {e}^k) = -\mathrm {I}(k). \end{aligned}$$

In the Black-Scholes model, i.e. a geometric Brownian motion with $S_0=RV(v)(0)$ and constant volatility $\xi $, we have the following small-time implied volatility behaviour:

$$\begin{aligned} \lim _{t \downarrow 0} \xi ^2 t \log \mathbb {P}(RV(v)(t) \ge \mathrm {e}^k) = -\frac{k^2}{2}. \end{aligned}$$

We then apply Corollary 3.7 and [25, Corollary 7.1], identifying $\xi \equiv {\hat{\sigma }}(t,k)$, to conclude. $\square $

Remark 3.9

Notice that the level of implied volatility in Corollary 3.8 has a power law behaviour as a function of time to maturity. This power law is of order $\sqrt{t^{\beta -1}}$, which is consistent with the at-the-money RV implied volatility results by Alòs et al. [1], where the order is found to be $t^{H-1/2}$ using Malliavin Calculus techniques. Recall that $\beta =2\alpha +1$, and $\alpha =H-1/2$ by Remark 2.2.
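
Corollary 3.8 and the power law above translate into a short computation once the function $\mathrm {I}$ is available. The Python sketch below takes $\mathrm {I}$ as a user-supplied callable `rate_I`; the function names are hypothetical, and the leading-order formula ${\hat{\sigma }}(t,k)\approx {\hat{\sigma }}(k)\,t^{(\beta -1)/2}={\hat{\sigma }}(k)\,t^{\alpha }$ is simply the limit in Corollary 3.8 read as an approximation for small $t$. As a sanity check, a toy rate function of Black-Scholes type, $\mathrm {I}(k)=k^2/(2\xi ^2)$, returns the constant $\xi $.

```python
import math


def limiting_implied_vol(k, rate_I):
    """sigma_hat(k) = |k| / sqrt(2 I(k)) for k != 0, from Corollary 3.8."""
    if k == 0.0:
        raise ValueError("the limit is stated for k != 0")
    i_k = rate_I(k)
    if i_k <= 0.0:
        raise ValueError("I(k) must be strictly positive for k != 0")
    return abs(k) / math.sqrt(2.0 * i_k)


def small_time_implied_vol(t, k, rate_I, alpha):
    """Leading-order approximation sigma_hat(k) * t^alpha, with alpha = (beta - 1) / 2."""
    return limiting_implied_vol(k, rate_I) * t ** alpha


if __name__ == "__main__":
    xi = 0.3  # toy volatility level, for illustration only

    def toy_rate(k):
        return k * k / (2.0 * xi * xi)

    print(limiting_implied_vol(0.2, toy_rate))
    print(small_time_implied_vol(0.01, 0.2, toy_rate, alpha=-0.4))
```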

3.2 Small-time results for the mixed rough Bergomi model

Minor adjustments to Theorem 3.3 give the following small-time result for the mixed variance process $v^{(\gamma ,\nu )}$ introduced in Model (2.5); the proof is given in “Appendix B.3”.

Theorem 3.10

The mixed variance process $( v_\varepsilon ^{(\gamma ,\nu )})_{\varepsilon >0}$ satisfies a large deviations principle on $\mathcal {C}(\mathcal {T})$ with speed $\varepsilon ^{-\beta }$ and rate function

$$\begin{aligned} \Lambda ^{(\gamma ,\nu )}(\mathrm {x}):= \inf \left\{ \Lambda ^Z\left( \frac{\eta }{\nu _1} \mathrm {y}\right) : \mathrm {x}(\cdot ) =v_0 \sum _{i=1}^n \gamma _i e^{\frac{\nu _i}{\nu _1}\mathrm {y}(\cdot )} \right\} , \end{aligned}$$

satisfying $\Lambda ^{(\gamma ,\nu )}(v_0)~=~0$.

By Remark 3.5, we immediately get the following result for the small-time behaviour of the integrated mixed variance process $RV(v^{(\gamma ,\nu )})$.

Corollary 3.11

The integrated mixed variance process $(RV\left( v^{(\gamma ,\nu )}\right) (t))_{t\in \mathcal {T}}$ satisfies a large deviations principle on $\mathbb {R}^*_+$ as t tends to zero, with speed $t^{-\beta }$ and rate function ${\tilde{\Lambda }}^{(\gamma ,\nu )}(\mathrm {y}) := \inf \left\{ \Lambda ^{(\gamma ,\nu )}(\mathrm {x}) : \mathrm {y}= RV(\mathrm {x})(1) \right\} $, where ${\tilde{\Lambda }}^{(\gamma ,\nu )}(v_0)=0$.

To get the small-time implied volatility result, analogous to Corollary 3.8, we need the following Lemma, which is used in place of (B.4). The remainder of the proof then follows identically.

Lemma 3.12

For all $t\in \mathcal {T}$ and $q>1$ we have

$$\begin{aligned} \mathbb {E}\left[ \left( RV\left( v^{(\gamma ,\nu )}\right) (t) \right) ^q \right] \le \frac{v^q_0 n^q}{t^{q-1}} \exp \left( \frac{q^2(\nu ^*)^2}{2\eta ^2}\left( q^2-q\right) t^{2\alpha +1}\right) , \end{aligned}$$

where $\nu ^*=\max \{\nu _1,\ldots ,\nu _n\}$.

Proof

First we note that by Hölder’s inequality $(\sum _{i=1}^n x_i)^q\le n^{q-1} \sum _{i=1}^n (x_i)^q$, for $x_i>0$. Since $\gamma _i\le 1$ for $i=1,\ldots ,n$, we obtain

$$\begin{aligned} \left( RV\left( v^{(\gamma ,\nu )}\right) (t)\right) ^q\le & {} \frac{v_0^q}{t^q} n^{q-1} \sum _{i=1}^n\int _0^t {\mathbb {E}}\left[ \exp \left( \frac{ q \nu _i}{\eta }Z_s-\frac{q\nu _i^2}{2 \eta ^2 }s^{2\alpha +1}\right) \right] \mathrm {d}s\\\le & {} \frac{v_0^q}{t^{q-1}} n^{q-1} \sum _{i=1}^n \exp \left( \frac{\nu _i^2}{2 \eta ^2 }\left( q^2-q \right) t^{2\alpha +1}\right) . \end{aligned}$$

Choosing $\nu ^*=\max \{\nu _1,\ldots ,\nu _n\}$ the result directly follows. $\square $
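As a quick numerical sanity check of the elementary inequality $(\sum _{i=1}^n x_i)^q\le n^{q-1} \sum _{i=1}^n (x_i)^q$ used at the start of the proof, the following minimal sketch evaluates both sides for arbitrary positive inputs. The helper name `power_sum_bound` and the sample values are illustrative assumptions and are not taken from the paper's code.

```python
def power_sum_bound(xs, q):
    """Return (lhs, rhs) of (sum x_i)^q <= n^(q-1) * sum x_i^q for x_i > 0, q > 1."""
    n = len(xs)
    lhs = sum(xs) ** q
    rhs = n ** (q - 1) * sum(x ** q for x in xs)
    return lhs, rhs


# Illustrative values only; any positive x_i and q > 1 should satisfy lhs <= rhs.
lhs, rhs = power_sum_bound([0.2, 1.5, 3.0], q=2.5)
print(lhs <= rhs)
```

Equality is attained when all the $x_i$ coincide, which is why the factor $n^{q-1}$ cannot be improved in general.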

Corollary 3.13

For log moneyness $k:=\log \frac{K}{RV(v^{(\gamma ,\nu )})(0)} \ne 0$, the following equality holds true for Call options on integrated variance in the mixed rough Bergomi model:

$$\begin{aligned} \lim _{t \downarrow 0} t^\beta \log {\mathbb {E}} \left[ \left( RV(v^{(\gamma ,\nu )})(t)-\mathrm {e}^k\right) ^+\right] = - \mathrm {I}(k), \end{aligned}$$

(3.7)

where $\mathrm {I}$ is defined as $\mathrm {I}(x) := \inf _{y>x} {\tilde{\Lambda }}^{(\gamma ,\nu )}(\mathrm {e}^y)$ for $x >0$, $\mathrm {I}(x) := \inf _{y<x} {\tilde{\Lambda }}^{(\gamma ,\nu )}(\mathrm {e}^y)$ for $x <0$.

Similarly, for log moneyness $k:=\log \frac{K}{\sqrt{RV(v^{(\gamma ,\nu )})(0)}} \ne 0$,

$$\begin{aligned} \lim _{t \downarrow 0} t^\beta \log {\mathbb {E}} \left[ \left( \sqrt{RV(v^{(\gamma ,\nu )})(t)}-\mathrm {e}^k\right) ^+\right] = -\bar{\mathrm {I}}(k), \end{aligned}$$

(3.8)

where $\bar{\mathrm {I}}$ is defined analogously as $\bar{\mathrm {I}}(x) := \inf _{y>x} {\tilde{\Lambda }}^{(\gamma ,\nu )}(\mathrm {e}^{2y})$ for $x >0$ and $\bar{\mathrm {I}}(x) := \inf _{y<x} {\tilde{\Lambda }}^{(\gamma ,\nu )}(\mathrm {e}^{2y})$ for $x<0$.

Proof

Follows directly from Lemma 3.12 and the proof of Corollary 3.7. $\square $

The small-time implied volatility behaviour for the mixed rough Bergomi model is then given by Corollary 3.8, where the function $\mathrm {I}$ is defined in terms of the rate function ${\tilde{\Lambda }}^{(\gamma ,\nu )}$, as in Corollary 3.13, in this case.
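The one-sided infimum defining $\mathrm {I}$ in Corollary 3.13 can be approximated on a grid once the rate function is available numerically. The sketch below is only an illustration: `toy_rate` is a hypothetical quadratic-in-log placeholder standing in for ${\tilde{\Lambda }}^{(\gamma ,\nu )}$ (it is not the paper's rate function), and the grid width and step count are arbitrary assumptions. The grid includes the endpoint $y=k$, which is harmless only if the rate function is continuous there.

```python
import math


def one_sided_inf(rate, k, width=5.0, steps=2000):
    """Grid approximation of I(k): inf of rate(exp(y)) over y > k (k > 0) or y < k (k < 0)."""
    if k == 0:
        raise ValueError("k must be non-zero")
    sign = 1.0 if k > 0 else -1.0
    # The grid starts at k itself; this assumes the rate function is continuous at exp(k).
    grid = (k + sign * width * j / steps for j in range(steps + 1))
    return min(rate(math.exp(y)) for y in grid)


def toy_rate(x, c=0.5):
    """Hypothetical placeholder rate function, vanishing at x = 1 (i.e. v_0 normalised to 1)."""
    return math.log(x) ** 2 / (2.0 * c * c)


print(one_sided_inf(toy_rate, 0.3), one_sided_inf(toy_rate, -0.2))
```

For a rate function that is increasing away from its zero, as in this placeholder, the infimum is attained at the boundary $y=k$, so $\mathrm {I}(k)$ reduces to the rate function evaluated at $\mathrm {e}^k$.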

3.3 Small-time results for the multi-factor rough Bergomi model

The small-time behaviour of the multi-factor rough Bergomi model (2.6) can then be obtained; see Theorem 3.14; note that $\Lambda ^m$ is the rate function associated to the reproducing kernel Hilbert space of the measure induced by $(W_1,\cdots ,W_m)$ on ${\mathcal {C}}(\mathcal {T},\mathbb {R}^m)$. The proof is given in “Appendix B.4”.

Theorem 3.14

The variance process in the multi-factor rough Bergomi model $\left( v^{(\gamma ,\nu ,\Sigma )}_t\right) _{t\in \mathcal {T}}$ satisfies a large deviations principle on $\mathbb {R}^*_+$ with speed $t^{-\beta }$ and rate function

$$\begin{aligned} \Lambda ^{(\gamma ,\nu ,\Sigma )} (\mathrm {y}) = \inf \left\{ \Lambda ^m(\mathrm {x}) : \mathrm {x} \in {\mathcal {H}}_m, \mathrm {y} = v_0 \sum _{i=1}^n \gamma _i \exp \left( \frac{\nu ^i}{\eta } \cdot \mathrm {L}_i \mathrm {x}(1)\right) \right\} , \end{aligned}$$

satisfying $\Lambda ^{(\gamma ,\nu ,\Sigma )} (v_0) = 0$.

As with the mixed variance process, Remark 3.5 immediately gives us the following small-time result for $RV(v^{(\gamma ,\nu ,\Sigma )})$.

Corollary 3.15

The integrated variance process $(RV\left( v^{(\gamma ,\nu ,\Sigma )}\right) (t))_{t\in \mathcal {T}}$ in the multi-factor rough Bergomi model satisfies a large deviations principle on $\mathbb {R}^*_+$ as t tends to zero, with speed $t^{-\beta }$ and rate function

$$\begin{aligned} {\tilde{\Lambda }}^{(\gamma ,\nu ,\Sigma )}(\mathrm {y}) := \inf \left\{ \Lambda ^{(\gamma ,\nu ,\Sigma )}(\mathrm {x}) : \mathrm {y}= RV(\mathrm {x})(1) \right\} , \end{aligned}$$

where ${\tilde{\Lambda }}^{(\gamma ,\nu ,\Sigma )}(v_0)=0$.

We now establish the small-time behaviour for Call options on realised variance in Corollary 3.17, by adapting the proof of Corollary 3.7 as in the previous subsection. To do so we use Lemma 3.16 in place of (B.4). Then we attain the small-time implied volatility behaviour for the multi-factor rough Bergomi model in Corollary 3.8, where the function $\mathrm {I}$ is given by Corollary 3.17.

Lemma 3.16

For all $t\in \mathcal {T}$ and $q>1$ we have

$$\begin{aligned} \mathbb {E}\left[ \left( RV\left( v^{(\gamma ,\nu ,\Sigma )}\right) (t) \right) ^q \right] \le \frac{v^q_0 n^q}{t^{q-1}} \exp \left( \frac{(\nu ^*)^2}{2 \eta ^2 }\left( q^2-q \right) t^{2\alpha +1}\right) , \end{aligned}$$

where $\nu ^*=\max \{\nu _1,\ldots ,\nu _n\}$.

Proof

First, we note that by Hölder’s inequality, $(\sum _{i=1}^n x_i)^q\le n^{q-1} \sum _{i=1}^n (x_i)^q$ for $x_i>0$. Since $\gamma _i\le 1$ for $i=1,\ldots ,n$, we obtain

$$\begin{aligned} \left( RV\left( v^{(\gamma ,\nu ,\Sigma )}\right) (t)\right) ^q\le & {} \frac{v_0^q}{t^q} n^{q-1} \sum _{i=1}^n\int _0^t {\mathbb {E}}\left[ {\mathcal {E}}\left( \frac{\nu ^i}{\eta }\cdot \mathrm {L}_i {\mathcal {Z}}_s\right) ^q \right] \mathrm {d}s\\\le & {} \frac{v_0^q}{t^{q-1}} n^{q-1} \sum _{i=1}^n \exp \left( \frac{\nu _i^2}{2 \eta ^2 }\left( q^2-q \right) t^{2\alpha +1}\right) . \end{aligned}$$

Choosing $\nu ^*=\max \{\nu _1,\ldots ,\nu _n\}$ the result directly follows. $\square $

Corollary 3.17

For log moneyness $k:=\log \frac{K}{RV(v^{(\gamma ,\nu ,\Sigma )})(0)} \ne 0$, the following equality holds true for Call options on integrated variance in the multi-factor rough Bergomi model:

$$\begin{aligned} \lim _{t \downarrow 0} t^\beta \log {\mathbb {E}} \left[ \left( RV(v^{(\gamma ,\nu ,\Sigma )})(t)-\mathrm {e}^k\right) ^+\right] = - \mathrm {I}(k), \end{aligned}$$

(3.9)

where $\mathrm {I}$ is defined as $\mathrm {I}(x) := \inf _{y>x} {\tilde{\Lambda }}^{(\gamma ,\nu ,\Sigma )}(\mathrm {e}^y)$ for $x >0$, $\mathrm {I}(x) := \inf _{y<x} {\tilde{\Lambda }}^{(\gamma ,\nu ,\Sigma )}(\mathrm {e}^y)$ for $x <0$.

Similarly, for log moneyness $k:=\log \frac{K}{\sqrt{RV(v^{(\gamma ,\nu ,\Sigma )})(0)}} \ne 0$,

$$\begin{aligned} \lim _{t \downarrow 0} t^\beta \log {\mathbb {E}} \left[ \left( \sqrt{RV(v^{(\gamma ,\nu ,\Sigma )})(t)}-\mathrm {e}^k\right) ^+\right] = -\bar{\mathrm {I}}(k), \end{aligned}$$

(3.10)

where $\bar{\mathrm {I}}$ is defined analogously as $\bar{\mathrm {I}}(x) := \inf _{y>x} {\tilde{\Lambda }}^{(\gamma ,\nu ,\Sigma )}(\mathrm {e}^{2y})$ for $x >0$ and $\bar{\mathrm {I}}(x) := \inf _{y<x} {\tilde{\Lambda }}^{(\gamma ,\nu ,\Sigma )}(\mathrm {e}^{2y})$ for $x<0$.

Proof

Follows directly from Lemma 3.16 and the proof of Corollary 3.7. $\square $

4 Numerical results

In this section we present numerical results for each of the three models given in Sect. 2. We also analyse the effect of each parameter on the implied volatility smile. Numerical experiments and codes are provided on GitHub: LDP-VolOptions.

4.1 RV smiles for rough Bergomi

We begin with numerical results for the rough Bergomi model (2.4) using Corollary 3.8. For the detailed numerical method we refer the reader to “Appendix C”. In Fig. 3, we represent the rate function given in Corollary 3.4, which is the fundamental object to compute numerically. In particular, we notice that ${\hat{\Lambda }}^v$ is convex; a rigorous mathematical proof of this statement is left for future research.

More interestingly, in Fig. 2 we compare Corollary 3.8 with a benchmark generated by Monte Carlo simulations, and observe that all smiles follow a linear trend. In particular, we notice that Corollary 3.8 provides a surprisingly accurate estimate, even for relatively large maturities. As a further numerical check, in Fig. 4 we compare our results with the closed-form at-the-money asymptotics given by Alòs et al. [1] and once again find the correct convergence, suggesting a consistent numerical framework.

4.2 RV smiles for mixed rough Bergomi

We now consider the mixed rough Bergomi model (2.5) in a simplified form given by $v_t=v_0 \left( \gamma _1 {\mathcal {E}}(\nu _1 Z_t)+\gamma _2 {\mathcal {E}}(\nu _2 Z_t)\right) $. In Fig. 5, we observe that a constraint of the type $\gamma _1\nu _1+\gamma _2\nu _2=2$ in the mixed variance process (2.5) allows us to fix the at-the-money implied volatility at a given level, whilst generating different slopes for different values of $(\nu _1,\nu _2,\gamma _1,\gamma _2)$; as in Fig. 2, we see that the smiles generated follow a linear trend. This is again consistent with the results found in [1].

Remark 4.1

At this point it is important to note that the mixed rough Bergomi model (2.5) allows both the at-the-money implied volatility and its skew to be controlled through $(\gamma ,\nu )$, whilst in the rough Bergomi model (2.4) there is not enough freedom to arbitrarily fit both quantities.

4.3 Linearity of smiles and approximation of the RV density

Remarkably, we observe a linear pattern in Figs. 2, 3, 4 and 5 for around-the-money strikes in the (mixed) rough Bergomi model. By assuming this linear implied volatility smile in log-moneyness space, in “Appendix A” we are able to provide an approximation scheme for the realised variance density $\psi _{RV}(x,T),\; x,T\ge 0$ (see Proposition A.3). This, in turn, allows us to compute the price of a volatility swap using

$$\begin{aligned} {\mathcal {P}}_{VolSwap(T)}=\int _0^{\infty } \sqrt{x}\psi _{RV}(x,T) \mathrm {d}x, \end{aligned}$$

where ${\mathcal {P}}_{VolSwap(T)}$ is the price of a Volatility Swap with maturity T. Figure 9 presents the results of this approximation technique, which turns out to be surprisingly accurate. We believe that this approximation could be useful to practitioners; more details are provided in “Appendix A”.
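The volatility swap integral above is a one-dimensional quadrature once a density is available. The following is a minimal sketch under stated assumptions: the trapezoidal rule is applied in the variable $u=\log x$ on a truncated range chosen by the caller, and `lognormal_density` is a hypothetical stand-in density with mean $v_0$ (used here only because $\mathbb {E}[\sqrt{X}]$ is then known in closed form); it is not the density of Proposition A.3.

```python
import math


def vol_swap_price(density, lo, hi, steps=4000):
    """Trapezoidal approximation of int_lo^hi sqrt(x) * density(x) dx, computed in u = log(x)."""
    a, b = math.log(lo), math.log(hi)
    h = (b - a) / steps
    total = 0.0
    for j in range(steps + 1):
        x = math.exp(a + j * h)
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * math.sqrt(x) * density(x) * x  # extra x is the Jacobian dx = x du
    return total * h


def lognormal_density(x, v0, s):
    """Hypothetical stand-in: density of v0 * exp(s * N - s^2 / 2) with N standard Gaussian."""
    z = (math.log(x / v0) + 0.5 * s * s) / s
    return math.exp(-0.5 * z * z) / (x * s * math.sqrt(2.0 * math.pi))


v0, s = 0.04, 0.3  # illustrative values only
price = vol_swap_price(lambda x: lognormal_density(x, v0, s),
                       v0 * math.exp(-10 * s), v0 * math.exp(10 * s))
print(price, math.sqrt(v0) * math.exp(-s * s / 8.0))  # quadrature vs closed form for the stand-in
```

To use the approximation of “Appendix A” instead, one would pass a callable implementing $\psi _{RV}(\cdot ,T)$ in place of the stand-in; the truncation range would then need to be chosen to cover the bulk of that density.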

4.4 RV smiles for mixed multi-factor rough Bergomi

We conclude our analysis by introducing the correlation effect in the implied volatility smiles, by considering the mixed multi-factor rough Bergomi model (2.6). We shall consider the following two simplified models for instantaneous variance

$$\begin{aligned} v_t&={\mathcal {E}}\left( \nu \int _0^t (t-s)^\alpha \mathrm {d}W_s\right. \left. + \eta \left( \rho \int _0^t (t-s)^\alpha \mathrm {d}W_s+\sqrt{1-\rho ^2}\int _0^t (t-s)^\alpha \mathrm {d}W^\perp _s\right) \right) , \end{aligned}$$

(4.1)

$$\begin{aligned} v_t&=\frac{1}{2}\left( {\mathcal {E}}\left( \nu \int _0^t (t-s)^\alpha \mathrm {d}W_s\right) \right. \nonumber \\&\quad \left. +\,{\mathcal {E}}\left( \eta \left( \rho \int _0^t (t-s)^\alpha \mathrm {d}W_s+\sqrt{1-\rho ^2}\int _0^t (t-s)^\alpha \mathrm {d}W^\perp _s\right) \right) \right) , \end{aligned}$$

(4.2)

where W and $W^\perp $ are independent standard Brownian motions and $\nu ,\eta >0$.

On the one hand, Fig. 6 shows the implied volatility smiles corresponding to (4.1). We conclude that adding up correlated factors inside the exponential does not change the behaviour of implied volatility smiles, which still have a linear form around the money. Moreover, in this context the results of [1] still hold, and we provide the asymptotic benchmark in Fig. 6 to support our numerical scheme. On the other hand, Fig. 7 shows the implied volatility smiles corresponding to (4.2), which are evidently non-linear around the money in the negatively correlated cases. Consequently, we can see that having a sum of exponentials, each driven by a different (fractional) Brownian motion, does indeed affect the convexity of the implied volatility around the money. We further superimpose a linear trend on top of the smiles in Fig. 8 to clearly show the effect of correlation on the convexity of the smiles.
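One way to see why (4.1) behaves like a single-factor model is to note that both stochastic integrals in the exponent share the kernel $(t-s)^\alpha $, so the exponent is a Volterra integral against $(\nu +\eta \rho )W+\eta \sqrt{1-\rho ^2}W^\perp $, which has the same law as $c$ times a standard Brownian motion with $c^2=\nu ^2+2\rho \nu \eta +\eta ^2$. The sketch below simply evaluates this effective vol-of-vol; the function name is an illustrative assumption, and the reduction presumes that ${\mathcal {E}}$ depends only on the law of its Gaussian argument.

```python
import math


def effective_vol_of_vol(nu, eta, rho):
    """Effective single-factor vol-of-vol c for model (4.1): c^2 = nu^2 + 2*rho*nu*eta + eta^2."""
    c_squared = nu * nu + 2.0 * rho * nu * eta + eta * eta
    return math.sqrt(max(c_squared, 0.0))  # guard against tiny negative rounding when rho = -1


for rho in (-1.0, 0.0, 1.0):  # illustrative correlations
    print(rho, effective_vol_of_vol(0.5, 1.5, rho))
```

No such reduction is available for (4.2), where the two exponentials are summed outside ${\mathcal {E}}$, which is consistent with the different smile shapes reported for that model.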

Remark 4.2

Note here that models such as the 2 Factor Bergomi (and its mixed version) [5, 6] have the same small-time limiting behaviour as (4.1) due to Proposition 2.7, where we set $L=L_{\text {Gamma}}$ and $\alpha =0$, meaning that their smiles follow a linear trend. Despite the empirical evidence given in Fig. 1, the commitment to linear smiles from a modelling perspective is strong. The simplified mixed multi-factor rough Bergomi model (4.2), however, is sufficiently flexible to generate both linear ($\rho \ge 0$) and nonlinear ($\rho < 0$) smiles.

5 Options on VIX

Although options on realised variance are the most natural core modelling object for stochastic volatility models, in practice the most popular variance derivative is the VIX. In this section we therefore turn our attention to the VIX and VIX options and study their asymptotic behaviour. For this section, we fix $\mathcal {T}:= [0,T]$. Let us now consider the following general model $(v_t)_{t\ge 0}$ for instantaneous variance:

$$\begin{aligned} v_t=\xi _0(t){\mathcal {E}}\left( \int _0^t g(t,s) \mathrm {d}W_s\right) . \end{aligned}$$

(5.1)

Then, the VIX process is given by

$$\begin{aligned} \text {VIX}_T=\sqrt{\frac{1}{\Delta }\int _T^{T+\Delta }{\mathbb {E}}[v_t|{\mathcal {F}}_T] \mathrm {d}t}. \end{aligned}$$

For notational convenience, we introduce the stochastic process $(V^{g,T}_t)_{t \in [T,T+\Delta ]}$, defined as

$$\begin{aligned} V^{g,T}_t:= \int _0^T g(t,s)\mathrm {d}W_s, \end{aligned}$$

(5.2)

and assume that the mapping $s \mapsto g(t,s) $ is in $L^2[0,T]$ for all $t \in [T,T+\Delta ]$ such that the stochastic integral in (5.2) is well-defined.

Proposition 5.1

The VIX dynamics in model (5.1), where the volatility of volatility is denoted by $\nu $, are given by

$$\begin{aligned} \text {VIX}_{T, \nu }^2 := \frac{1}{\Delta }\int _T^{T+\Delta }\xi _0(t)\exp \left( \nu V_t^{g,T}-\frac{\nu ^2}{2}{\mathbb {E}}[(V_t^{g,T})^2] \right) \mathrm {d}t. \end{aligned}$$

Proof

Follows directly from [30, Proposition 3.1]. $\square $
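Given values of $V_t^{g,T}$ and of its variance on a time grid, the expression in Proposition 5.1 is a one-dimensional integral over $[T,T+\Delta ]$. The following minimal sketch approximates it with the trapezoidal rule. The function name, the equally spaced grid and the input values are assumptions made for illustration; how the path of $V^{g,T}$ is simulated is deliberately left outside the sketch.

```python
import math


def vix_squared(xi0, v_path, v_var, nu, T, delta):
    """Trapezoidal approximation of
    (1/Delta) * int_T^{T+Delta} xi0(t) * exp(nu * V_t - nu^2/2 * E[V_t^2]) dt.

    v_path[j] and v_var[j] hold V_t and E[V_t^2] at t = T + j * Delta / n, j = 0..n.
    """
    n = len(v_path) - 1
    h = delta / n
    total = 0.0
    for j in range(n + 1):
        t = T + j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * xi0(t) * math.exp(nu * v_path[j] - 0.5 * nu ** 2 * v_var[j])
    return total * h / delta


# Illustrative inputs: flat forward variance and a made-up path on 11 grid points.
path = [0.1 * math.sin(j) for j in range(11)]
variances = [0.05 + 0.001 * j for j in range(11)]
print(vix_squared(lambda t: 0.04, path, variances, nu=1.5, T=1.0, delta=1.0 / 12.0))
```

With $\nu =0$ the exponential factor is identically one, so the routine returns the average of $\xi _0$ over $[T,T+\Delta ]$, which gives a simple consistency check.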

We now define the following $L^2[0,T]$ operator $\mathcal {I}^{g,T}:L^2[0,T] \rightarrow \mathcal {C}[T,T+\Delta ]$, and space $\mathscr {H}^{g,T}$ as

$$\begin{aligned} \mathcal {I}^{g,T}f(\cdot ):= \int _0^T g(\cdot ,s) f(s) \mathrm {d}s, \quad \mathscr {H}^{g,T}:= \{\mathcal {I}^{g,T}f: f\in L^2[0,T] \}, \end{aligned}$$

(5.3)

where the space $\mathscr {H}^{g,T}$ is equipped with the following inner product $ \langle \mathcal {I}^{g,T}f_1, \mathcal {I}^{g,T}f_2 \rangle _{\mathscr {H}^{g,T}}:= \langle f_1, f_2 \rangle _{L^2[0,T]}. $ Note that the function g must be such that the operator $\mathcal {I}^{g,T}$ is injective so that the inner product $\langle \cdot , \cdot \rangle _{\mathscr {H}^{g,T}}$ on $\mathscr {H}^{g,T}$ is well-defined.

Proposition 5.2

Assume that there exists $ h \in L^2[0,T]$ such that $\int _0^{\varepsilon } |h(s)|\mathrm {d}s< +\infty $ for some $\varepsilon >0$ and $g(t,\cdot )=h(t-\cdot )$ for any $t \in [T,T+\Delta ]$. Then, the space $\mathscr {H}^{g,T}$ is the reproducing kernel Hilbert space for the process $( V^{g,T}_t)_{t\in [T, T+\Delta ]}$.

Proof

See “Appendix B.5”. $\square $

Theorem 5.3

For any $\gamma >0$, the sequence of stochastic processes $(\varepsilon ^{\gamma /2} V^{g,T})_{\varepsilon >0}$ satisfies a large deviations principle on $\mathcal {C}[T,T+\Delta ]$ with speed $\varepsilon ^{-\gamma }$ and rate function $\Lambda ^V$, defined as

$$\begin{aligned} \Lambda ^V(\mathrm {x}) := {\left\{ \begin{array}{ll} \displaystyle \frac{1}{2}\Vert \mathrm {x}\Vert _{\mathscr {H}^{g,T}}^2, &{} \text {if }\,\,\mathrm {x}\in \mathscr {H}^{g,T},\\ + \infty , &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$

(5.4)

Proof

Direct application of the generalised Schilder’s Theorem [17, Theorem 3.4.12]. $\square $

Remark 5.4

We now introduce a Borel subset of $\mathcal {C}[T,T+\Delta ]$, defined as $A := \{ g\in \mathcal {C}[T,T+\Delta ]:g(x)~\ge ~1 \text { for all } x\in [T,T+\Delta ]\}$. Then, by a simple application of Theorem 5.3 and using that the rate function $\Lambda ^V$ is continuous on A, we obtain the following tail behaviour of the process $V^{g,T}$:

$$\begin{aligned} \lim _{\varepsilon \downarrow 0} \varepsilon ^\gamma \log {\mathbb {P}} \left( V^{g,T}_t \ge \frac{1}{\varepsilon ^{\gamma /2}}\right) = - \inf _{g\in A} \Lambda ^V(g), \end{aligned}$$

(5.5)

for any $\gamma >0$ and $t\in [T,T+\Delta ]$.
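For fixed $t$, the random variable $V^{g,T}_t$ is centred Gaussian with variance $\sigma ^2=\int _0^T g^2(t,s)\mathrm {d}s$, so the left-hand side of (5.5) can be evaluated directly with the complementary error function. The sketch below does this for a single illustrative value of $\sigma $; the function name and parameter values are assumptions. The standard Gaussian tail estimate suggests that the scaled log-probability approaches $-1/(2\sigma ^2)$ as $\varepsilon \downarrow 0$, with a slowly vanishing logarithmic correction.

```python
import math


def scaled_log_tail(sigma, eps, gamma):
    """Evaluate eps^gamma * log P(N(0, sigma^2) >= eps^(-gamma/2))."""
    level = eps ** (-gamma / 2.0)
    prob = 0.5 * math.erfc(level / (sigma * math.sqrt(2.0)))
    return eps ** gamma * math.log(prob)


# Illustrative values; eps cannot be taken too small before prob underflows in double precision.
for eps in (1.0 / 100.0, 1.0 / 400.0, 1.0 / 900.0):
    print(eps, scaled_log_tail(1.0, eps, 1.0))
```

Because the probability underflows quickly, a production implementation would work with an asymptotic expansion of the log-tail rather than the probability itself.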

Remark 5.5

Let us again fix the kernel g as the rough Bergomi kernel and denote the corresponding reproducing kernel Hilbert space by ${\mathscr {H}}^{\eta , \alpha , T}$ and the corresponding process $V^{g,T}$ as $V^{\eta ,\alpha ,T}$. If $\mathrm {x}\in {\mathscr {H}}^{\eta , \alpha , T}$ it follows that there exists $f\in L^2[0,T]$ such that $\mathrm {x}(t)~=~\int _0^T \eta \sqrt{2\alpha +1}(t-s)^\alpha f(s)\mathrm {d}s $ for all $t \in [T,T+\Delta ]$. Clearly, it follows that $\mathrm {x}\in {\mathscr {H}}^{a \eta , \alpha , T} $ for any $a>0$, as $f\in L^2[0,T]$ implies that $\frac{1}{a} f=:f_a \in L^2[0,T]$. We can compute the norm of $\mathrm {x}$ in each of these spaces to arrive at the following isometry:

$$\begin{aligned} \Vert \mathrm {x}\Vert ^2_{{\mathscr {H}}^{a \eta , \alpha , T}} = \Vert f_a \Vert ^2_{L^2[0,T]} = \frac{1}{a^2} \int _0^T f^2(s)\mathrm {d}s = \frac{1}{a^2} \Vert \mathrm {x}\Vert ^2_{{\mathscr {H}}^{ \eta , \alpha , T}}. \end{aligned}$$

(5.6)

We may now amalgamate (5.4), (5.5), and (5.6) to arrive at the following statement, which tells us how the large strike behaviour scales with the vol-of-vol parameter $\eta $ in the rough Bergomi model:

$$\begin{aligned} \lim _{\varepsilon \downarrow 0} \varepsilon ^\gamma \log \mathbb {P} \left( V^{a\eta , \alpha , T}_t \ge {\frac{1}{\varepsilon ^{\gamma /2}}} \right) = \lim _{\varepsilon \downarrow 0} \varepsilon ^\gamma \log \left( \mathbb {P} \left( V^{\eta , \alpha , T}_t \ge {\frac{1}{\varepsilon ^{\gamma /2}}} \right) ^{1/a^2}\right) . \end{aligned}$$

(5.7)

Indeed, (5.7) tells us precisely how increasing the vol-of-vol parameter $\eta $ multiplicatively by a factor a in the rough Bergomi model increases the probability that the associated process $V^{g,T}$ will exceed a certain level.
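The norm scaling in (5.6) can be illustrated on a discretised example: replacing $\eta $ by $a\eta $ in the kernel and $f$ by $f_a=f/a$ leaves $\mathrm {x}$ unchanged, while the squared $L^2$ norm is divided by $a^2$. The sketch below uses a midpoint rule; the helper names, the test function $f$ and all parameter values are illustrative assumptions.

```python
import math


def apply_kernel(f_vals, s_grid, h, t, eta, alpha):
    """Midpoint-rule approximation of int_0^T eta*sqrt(2*alpha+1)*(t-s)^alpha * f(s) ds, t >= T."""
    c = eta * math.sqrt(2.0 * alpha + 1.0)
    return sum(c * (t - s) ** alpha * f * h for s, f in zip(s_grid, f_vals))


def l2_norm_sq(f_vals, h):
    """Midpoint-rule approximation of the squared L^2[0, T] norm."""
    return sum(f * f for f in f_vals) * h


T, n, eta, alpha, a = 1.0, 200, 1.2, -0.4, 2.0  # illustrative values only
h = T / n
s_grid = [(j + 0.5) * h for j in range(n)]
f = [math.cos(3.0 * s) for s in s_grid]
f_a = [v / a for v in f]

x_eta = apply_kernel(f, s_grid, h, T + 0.01, eta, alpha)
x_a_eta = apply_kernel(f_a, s_grid, h, T + 0.01, a * eta, alpha)
ratio = l2_norm_sq(f_a, h) / l2_norm_sq(f, h)
print(x_eta, x_a_eta, ratio)  # same path value; ratio of squared norms is 1 / a^2
```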

Before stating the main theorem of this section, we first define the following rescaled process:

$$\begin{aligned} V^{g,T,\varepsilon }_t:= \varepsilon ^{\gamma /2}V^{g,T}_t, \quad {\widetilde{V}}^{g,T,\varepsilon }_t:= V^{g,T,\varepsilon }_t - \frac{\varepsilon ^\gamma }{2}\int _0^tg^2(t,u)\mathrm {d}u + \varepsilon ^{\gamma /2}, \end{aligned}$$

(5.8)

for $\varepsilon \in [0,1]$, $t \in [T,T+\Delta ]$. We also define the following $\mathcal {C}\left( [T,T+\Delta ] \times [0,1] \right) $ operators $\varphi _{1,\xi _0}, \varphi _2$, which map to $\mathcal {C}\left( [T,T+\Delta ] \times [0,1] \right) $ and $\mathcal {C}[0,1]$ respectively, as

$$\begin{aligned} (\varphi _{1,\xi _0}f)(s,\varepsilon ):= \xi _0(s)\exp (f(s,\varepsilon )), \quad (\varphi _2g)(\varepsilon ):=\frac{1}{\Delta } \int _T^{T+\Delta } g(s,\varepsilon )\mathrm {d}s. \end{aligned}$$

(5.9)

Note that in the definition of $\varphi _{1,\xi _0}$ in (5.9) we assume $\xi _0$ to be a continuous, single valued, and strictly positive function on $[T,T+\Delta ]$. This then implies that for every $s\in [T,T+\Delta ]$, the map $\varepsilon \mapsto (\varphi _{1,\xi _0} f)(s,\varepsilon )$ is a bijection and hence has an inverse, denoted by $\varphi ^{-1}_{1,\xi _0}$, which is defined as $ (\varphi ^{-1}_{1,\xi _0}f)(s,\varepsilon ):=\log \left( \frac{f(s,\varepsilon )}{\xi _0(s)} \right) $.
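The operators in (5.9) and the inverse of $\varphi _{1,\xi _0}$ act pointwise or by time-averaging, which can be sketched with plain Python closures. All names below, the trapezoidal discretisation of $\varphi _2$, and the example functions are assumptions made for illustration.

```python
import math


def phi1(xi0):
    """(phi1 f)(s, eps) = xi0(s) * exp(f(s, eps))."""
    return lambda f: (lambda s, eps: xi0(s) * math.exp(f(s, eps)))


def phi1_inverse(xi0):
    """(phi1^{-1} f)(s, eps) = log(f(s, eps) / xi0(s)); requires xi0 > 0 and f > 0."""
    return lambda f: (lambda s, eps: math.log(f(s, eps) / xi0(s)))


def phi2(g, T, delta, steps=1000):
    """(phi2 g)(eps) = (1/Delta) * int_T^{T+Delta} g(s, eps) ds, via the trapezoidal rule."""
    def averaged(eps):
        h = delta / steps
        total = sum((0.5 if j in (0, steps) else 1.0) * g(T + j * h, eps)
                    for j in range(steps + 1))
        return total * h / delta
    return averaged


xi0 = lambda s: 0.04 + 0.01 * s          # illustrative strictly positive forward variance
f = lambda s, eps: eps * math.sin(s)     # illustrative input function
round_trip = phi1_inverse(xi0)(phi1(xi0)(f))
print(round_trip(1.2, 0.5), f(1.2, 0.5))
```

The strict positivity of $\xi _0$ is what makes the logarithm in the inverse well-defined.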

Theorem 5.6

For any $\gamma >0$, the sequence of rescaled VIX processes $(e^{\varepsilon ^{\gamma /2}}\text {VIX}_{T,\varepsilon ^{\gamma /2}})_{\varepsilon \in [0,1]} $ satisfies a pathwise large deviations principle on $\mathcal {C}[0,1]$ with speed $\varepsilon ^{-\gamma }$ and rate function

$$\begin{aligned} \Lambda ^{\text {VIX}}(\mathrm {x}):= \inf _{s\in [T,T+\Delta ]} \left\{ \Lambda ^V\left( \log \left( \frac{\mathrm {y}(s,\cdot )}{\xi _0(s)} \right) \right) : \mathrm {x}(\cdot ) = (\varphi _2\mathrm {y})(\cdot ) \right\} . \end{aligned}$$

Proof

See “Appendix B.6”. $\square $

Remark 5.7

Using Theorem 5.6, we can deduce the small-noise, large strike behaviour of VIX options. Indeed, for the Borel subset A of $\mathcal {C}[T,T+\Delta ]$ introduced in Remark 5.4 we have that

$$\begin{aligned} \lim _{\varepsilon \downarrow 0} \varepsilon ^\gamma \log \mathbb {P}\left( \text {VIX}_{T, \varepsilon ^{\gamma /2}} \ge e^{-\varepsilon ^{\gamma /2}} \right) = - \inf _{g \in A} \Lambda ^{\text {VIX}}(g), \end{aligned}$$

for any $\gamma >0$.

6 Conclusions

In this paper we have characterised, for the first time, the small-time behaviour of options on integrated variance in rough volatility models, using large deviations theory. Our approach has a solid theoretical basis, with very convincing corresponding numerics, which agree with observed market phenomena and the theoretical results attained by Alòs et al. [1]. Both the theoretical and the numerical results hold for each of the three rough volatility models presented, whose complexity increases. Any of the three, with our corresponding results, would be suitable for practical use; the user would simply choose the level of complexity needed to satisfy their individual needs. Note also that the theoretical results are widely applicable, and one could very easily adapt results presented in this paper to other models where the volatility process also satisfies a large deviations principle, and whose rate function can be computed easily and accurately.

Perhaps surprisingly, we have discovered that lognormal models such as rough Bergomi [7], 2 Factor Bergomi [5, 6] and mixed versions thereof, generate linear smiles around the money for options on realised variance in log-space. This is, at the very least, a property to be taken into account when modelling volatility derivatives and, to our knowledge, has never been addressed or commented on in previous works. Whether such an assumption is realistic or not, we have in addition provided an explicit way to construct a model that generates non-linear smiles should this be required or desired.

We have also proved a pathwise large deviations principle for rescaled VIX processes, in a fairly general setting with minimal assumptions on the kernel of the stochastic integral used to define the instantaneous variance; these results are then used to establish the small-noise, large strike asymptotic behaviour of the VIX. The current set up does not allow us to deduce the small-time VIX behaviour from the pathwise large deviations principle, but this would be a very interesting area for future research. Our numerical scheme would most likely give a good approximation for the rate function and corresponding small-time VIX smiles.

References

1. Alòs, E., García-Lorite, D., Muguruza, A.: On smile properties of volatility derivatives and exotic products: understanding the VIX skew (2018). arXiv:1808.03610
2. Alòs, E., León, J., Vives, J.: On the short-time behavior of the implied volatility for jump-diffusion models with stochastic volatility. Finance Stoch. 11(4), 571–589 (2007)
3. Avellaneda, M., Papanicolau, A.: Statistics of VIX futures and their applications to trading volatility exchange-traded products. J. Invest. Strateg. 7(2), 1–33 (2018)
4. Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover, New York (1972)
5. Bergomi, L.: Smile dynamics III. Risk, pp. 90–96 (2008)
6. Bergomi, L.: Stochastic Volatility Modeling. Chapman & Hall/CRC Financial Mathematics Series (2016)
7. Bayer, C., Friz, P.K., Gatheral, J.: Pricing under rough volatility. Quant. Finance 16(6), 887–904 (2016)
8. Bayer, C., Friz, P., Gulisashvili, A., Horvath, B., Stemper, B.: Short-time near the money skew in rough fractional stochastic volatility models. Quant. Finance 17(3), 779–798 (2017)
9. Bingham, N.H., Goldie, C.M., Teugels, J.L.: Regular Variation, 2nd edn. Cambridge University Press, Cambridge (1989)
10. Biagini, F., Hu, Y., Øksendal, B., Zhang, T.: Stochastic Calculus for Fractional Brownian Motion and Applications. Springer, London (2008)
11. Breeden, D.T., Litzenberger, R.H.: Prices of state-contingent claims implicit in options prices. J. Bus. 51, 621–651 (1978)
12. Bennedsen, M., Lunde, A., Pakkanen, M.S.: Hybrid scheme for Brownian semistationary processes. Finance Stoch. 21(4), 931–965 (2017)
13. Barndorff-Nielsen, O.E., Schmiegel, J.: Ambit processes: with applications to turbulence and tumour growth. In: Benth, F.E., Di Nunno, G., Lindstrøm, T., Øksendal, B., Zhang, T. (eds.) Stochastic Analysis and Applications, Volume 2 of Abel Symposium, pp. 93–124. Springer, Berlin (2007)
14. Carr, P., Geman, H., Madan, D., Yor, M.: Pricing options on realized variance. Finance Stoch. 9(4), 453–475 (2005)
15. Cherny, A.: Brownian moving averages have conditional full support. Ann. Appl. Probab. 18(5), 1825–1850 (2008)
16. Carr, P., Lee, R.: Volatility derivatives. Ann. Rev. Financ. Econ. 1(1), 319–339 (2009)
17. Deuschel, J.D., Stroock, D.W.: Large Deviations. Academic Press, Boston (1989)
18. Dembo, A., Zeitouni, O.: Large Deviations Theory and Applications, 2nd edn. Springer, Berlin (2010)
19. El Euch, O., Rosenbaum, M.: Perfect hedging under rough Heston models. Ann. Appl. Probab. 28(6), 3813–3856 (2018)
20. El Euch, O., Rosenbaum, M.: The characteristic function of rough Heston models. Math. Finance 29(1), 3–38 (2019)
21. Friz, P., Gassiat, P., Pigato, P.: Precise asymptotics: robust stochastic volatility models (2018). Preprint available at arXiv:1811.00267
22. Forde, M., Zhang, H.: Asymptotics for rough stochastic volatility models. SIAM J. Finan. Math. 8(1), 114–145 (2017)
23. Gassiat, P.: On the martingale property in the rough Bergomi model. Electron. Commun. Probab. 24(33) (2019)
24. Gatheral, J., Jaisson, T., Rosenbaum, M.: Volatility is rough. Quant. Finance 18(6), 933–949 (2018)
25. Gao, K., Lee, R.: Asymptotics of implied volatility to arbitrary order. Finance Stoch. 18(2), 349–392 (2014)
26. Horvath, B., Jacquier, A., Lacombe, C.: Asymptotic behaviour of randomised fractional volatility models. J. Appl. Probab. 56, 496–523 (2019). (Forthcoming)
27. Horvath, B., Jacquier, A., Muguruza, A.: Functional central limit theorems for rough volatility (2017). arXiv:1711.03078
28. Horvath, B., Jacquier, A., Tankov, P.: Volatility options in rough volatility models. SIAM J. Finan. Math. 11(2), 437–469 (2020)
29. Horvath, B., Muguruza, A., Tomas, M.: Deep learning volatility (2019). arXiv:1901.09647
30. Jacquier, A., Martini, C., Muguruza, A.: On VIX Futures in the rough Bergomi model. Quant. Finance 18(1), 45–61 (2018)
31. Jacquier, A., Pakkanen, M., Stone, H.: Pathwise large deviations for the rough Bergomi model. J. Appl. Probab. 55(4), 1078–1092 (2018)
32. McCrickerd, R., Pakkanen, M.: Turbocharging Monte Carlo pricing for the rough Bergomi model. Quant. Finance 18(11), 1877–1886 (2018)
33. Neuberger, A.: The log contract. J. Portf. Manag. 20(2), 74–80 (1994)
34. Pirjol, D., Zhu, L.: Short maturity Asian options in local volatility models. SIAM J. Financ. Math. 7, 947–992 (2016)
35. Stone, H.: Calibrating rough volatility models: a convolutional neural network approach. Quant. Finance 20(3), 379–392 (2020)

Author information

Authors and Affiliations

Department of Mathematics, Imperial College London, London, UK
Chloe Lacombe & Henry Stone
Department of Mathematics, Imperial College London and Synergis, London, UK
Aitor Muguruza

Corresponding author

Correspondence to Aitor Muguruza.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors are grateful to Antoine Jacquier, Mikko Pakkanen and Ryan McCrickerd for stimulating discussions. AM and HS thank the EPSRC CDT in Financial Computing and Analytics for financial support.

Appendices

Appendix A: Approximating the density of realised variance in the mixed rough Bergomi model

In light of the numerical results shown in Sect. 4 (see Figs. 2, 3, 4, 5, 6) we identify a clear linear trend in the implied volatility smiles generated by both the rough Bergomi and mixed rough Bergomi models. Therefore, it is natural to postulate the following conjecture/approximation of log-linear smiles.

Assumption A.1

The implied volatility of realised variance options in the mixed rough Bergomi model (2.5) is linear in log-moneyness, and takes the following form:

$$\begin{aligned} {\hat{\sigma }}(K,T)=\left( T^\beta \left( a(\alpha ,\gamma ,\nu )+b(\alpha ,\gamma ,\nu )\log \left( \frac{K}{RV(v)(0)}\right) \right) \right) ^+ \end{aligned}$$

where

$$\begin{aligned} a(\alpha ,\gamma ,\nu )&=\frac{\sqrt{2\alpha +1}\sum _{i=1}^n \gamma _i \nu _i}{(\alpha +1)\sqrt{2\alpha + 3}},\\ b(\alpha ,\gamma ,\nu )&= \sqrt{2\alpha +1}\left( \frac{\sum _{i=1}^n \gamma _i \nu _i^2}{\sum _{i=1}^n \gamma _i \nu _i}{\mathcal {I}}(\alpha )(2\alpha +3)^{3/2}(\alpha +1)-\frac{\sum _{i=1}^n \gamma _i \nu _i}{(2\alpha +2)\sqrt{(2\alpha +3)}}\right) , \end{aligned}$$

with

$$\begin{aligned}&{\mathcal {I}}(\alpha )\\&\quad =\frac{\bigg (\displaystyle \sum \nolimits _{n=0}^{\infty }\frac{(\alpha )_n}{(\alpha +2)_n}\frac{1-2^{-2\alpha -3-n}}{2\alpha +3+n}+\sum \nolimits _{n=0}^{\infty }(-1)^n \frac{(-\alpha )_n (\alpha +1)}{(\alpha +2+n)n!} \frac{{\hat{F}}(n,1)-2^{n-1/2-n}{\hat{F}}(n,1/2)}{\alpha +1-n}\bigg )}{(\alpha +1)(4\alpha +5)} \end{aligned}$$

such that ${\hat{F}}(n,x)=\text { }_2F_1(-n-2\alpha -2,\alpha +1-n,\alpha +2-n,x)$ and $(x)_n=\displaystyle \prod \nolimits _{i=0}^{n-1}(x+i)$ represents the rising Pochhammer factorial.

Remark A.2

The values of the constants $a(\alpha ,\gamma ,\nu )$ and $b(\alpha ,\gamma ,\nu )$ in Assumption A.1, which give the level and slope of the implied volatility, are given in [1, Example 24 and Example 27] respectively; we generalise to n factors. These results are given in terms of the Hurst parameter H; to avoid any confusion we will continue with our use of $\alpha $. Recall that, by Remark 2.2, $\alpha =H-1/2$.
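The level coefficient $a(\alpha ,\gamma ,\nu )$ and the linear smile of Assumption A.1 are straightforward to evaluate. The sketch below is illustrative only: the function names are assumptions, the slope $b$ is taken as an input rather than computed (its formula involves the series ${\mathcal {I}}(\alpha )$, which is not implemented here), and the parameter values are arbitrary. It also shows that two parameter sets with the same value of $\sum _i \gamma _i\nu _i$ share the same level $a$.

```python
import math


def atm_level(alpha, gammas, nus):
    """Level coefficient a(alpha, gamma, nu) from Assumption A.1."""
    weighted = sum(g * n for g, n in zip(gammas, nus))
    return math.sqrt(2.0 * alpha + 1.0) * weighted / ((alpha + 1.0) * math.sqrt(2.0 * alpha + 3.0))


def linear_smile(K, T, beta, a, b, rv0):
    """sigma_hat(K, T) = (T^beta * (a + b * log(K / RV(v)(0))))^+ with the slope b supplied by the caller."""
    return max(T ** beta * (a + b * math.log(K / rv0)), 0.0)


alpha = -0.4  # illustrative roughness parameter
a_first = atm_level(alpha, [0.5, 0.5], [1.0, 3.0])     # sum gamma_i * nu_i = 2
a_second = atm_level(alpha, [0.25, 0.75], [2.0, 2.0])  # sum gamma_i * nu_i = 2
print(a_first, a_second)
print(linear_smile(K=0.05, T=0.5, beta=0.1, a=a_first, b=0.2, rv0=0.04))  # beta and b are made up
```

For $\alpha =0$ and a single factor with $\nu =\gamma =1$, `atm_level` returns $1/\sqrt{3}$.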

Proposition A.3

Under Assumption A.1, the density of $RV(v^{(\gamma ,\nu )})(T)$ is given by

$$\begin{aligned} \psi _{RV}(x,T)=-\phi (d_2(x))\frac{\partial d_1(x)}{\partial x}\left( a(\alpha ,\gamma ,\nu ) T^{\alpha +1/2}d_1(x)+1\right) ,\quad x\ge 0 \end{aligned}$$

where $d_1(x)=\frac{\log (v_0)-\log (x)}{{\hat{\sigma }}(x,T)\sqrt{T}}+\frac{1}{2}{\hat{\sigma }}(x,T)\sqrt{T}$, $d_2(x)=d_1(x)- {\hat{\sigma }}(x,T)\sqrt{T}$ for $x\ge 0$ and $\phi (\cdot )$ is the standard Gaussian probability density function.

Proof

Let us denote

$$\begin{aligned} C(K,T):={\mathbb {E}}[(RV(v^{(\gamma ,\nu )})(T)-K)^+]. \end{aligned}$$

The well-known Breeden–Litzenberger formula [11] tells us that

$$\begin{aligned} \frac{\partial ^2 C(x,T)}{\partial x^2}\bigg |_{x=K}=\psi _{RV}(K,T). \end{aligned}$$

Under Assumption A.1, we have that

$$\begin{aligned} C(K,T)=BS(v_0,{\hat{\sigma }}(K,T),K,T) \end{aligned}$$

where $BS(v_0,\sigma ,K,T)=v_0\Phi (d_1)-K\Phi (d_2)$ is the Black-Scholes Call pricing formula with $\Phi $ the standard Gaussian cumulative distribution function. Then, differentiating C with respect to the strike gives

$$\begin{aligned} \frac{\partial C(x,T)}{\partial x}\bigg |_{x=K}=v_0\phi (d_1(K)) \frac{\partial d_1(x)}{\partial x}\bigg |_{x=K}-K\phi (d_2(K)) \frac{\partial d_2(x)}{\partial x}\bigg |_{x=K}-\Phi (d_2(K)) \end{aligned}$$

where

$$\begin{aligned} \frac{\partial d_1(x)}{\partial x}&=\frac{-{\hat{\sigma }}(x,T)+\log (x/v_0)a(\alpha ,\gamma ,\nu ) T^{\alpha }}{x{\hat{\sigma }}(x,T)^2\sqrt{T}}+\frac{1}{2}\frac{a(\alpha ,\gamma ,\nu ) T^{\alpha +1/2}}{x}\\&=\frac{-b(\alpha ,\gamma ,\nu ) T^{\alpha }}{x{\hat{\sigma }}(x,T)^2\sqrt{T}}+\frac{1}{2}\frac{a(\alpha ,\gamma ,\nu ) T^{\alpha +1/2}}{x}\end{aligned}$$

and

$$\begin{aligned} \frac{\partial d_2(x)}{\partial x}=\frac{\partial d_1(x)}{\partial x}-\frac{a(\alpha ,\gamma ,\nu ) T^{\alpha +1/2}}{x}. \end{aligned}$$

Using the well-known identity $v_0\phi (d_1(x))=x \phi (d_2(x))$, proved in “Appendix E”, we further simplify

$$\begin{aligned} \frac{\partial C(K,T)}{\partial K}=v_0\phi (d_1(K))\left( \frac{a(\alpha ,\gamma ,\nu ) T^{\alpha +1/2}}{K}\right) -\Phi (d_2(K)). \end{aligned}$$

Differentiating again we obtain,

$$\begin{aligned} \psi _{RV}(K,T)= & {} -v_0\phi (d_1(K))\frac{a(\alpha ,\gamma ,\nu ) T^{\alpha +1/2}}{K}\left( d_1(K)\frac{\partial d_1(x)}{\partial x}\bigg |_{x=K}+\frac{1}{K}\right) \\&-\,\phi (d_2(K))\frac{\partial d_2(x)}{\partial x}\bigg |_{x=K}. \end{aligned}$$

Then, by using $v_0\phi (d_1(x))=x \phi (d_2(x))$, we find that

$$\begin{aligned} \psi _{RV}(K,T)=-\phi (d_2(K))\left( a(\alpha ,\gamma ,\nu ) T^{\alpha +1/2}\left( d_1(K)\frac{\partial d_1(x)}{\partial x}\bigg |_{x=K}+\frac{1}{K}\right) +\frac{\partial d_2(x)}{\partial x}\bigg |_{x=K}\right) , \end{aligned}$$

which we further simplify to

$$\begin{aligned} \psi _{RV}(K,T)=-\phi (d_2(K))\frac{\partial d_1(x)}{\partial x}\bigg |_{x=K}\left( a(\alpha ,\gamma ,\nu ) T^{\alpha +1/2}d_1(K)+1\right) , \end{aligned}$$

and the result then follows. Note that the density $\psi _{RV}(\cdot ,T)$ is indeed continuous for all $T>0$. $\square $
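The closed form of Proposition A.3 can be sanity-checked against the Breeden–Litzenberger second derivative used in the proof. The sketch below is illustrative only. It assumes the smile $\hat{\sigma }(x,T)=T^{\alpha }\left( b+a\log (x/v_0)\right) $, which is the shape implied by the expressions for $\partial d_1/\partial x$ and $\partial d_2/\partial x$ above (Assumption A.1 itself is not restated here), and the parameter values and function names are hypothetical.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def sigma_hat(x, T, v0, a, b, alpha):
    # ASSUMPTION: smile shape read off from the derivatives in the proof.
    return T ** alpha * (b + a * math.log(x / v0))

def d1_d2(x, T, v0, a, b, alpha):
    s = sigma_hat(x, T, v0, a, b, alpha) * math.sqrt(T)
    d1 = math.log(v0 / x) / s + 0.5 * s
    return d1, d1 - s

def call_price(K, T, v0, a, b, alpha):
    # Black-Scholes call on realised variance with the strike-dependent volatility.
    d1, d2 = d1_d2(K, T, v0, a, b, alpha)
    return v0 * norm_cdf(d1) - K * norm_cdf(d2)

def rv_density(x, T, v0, a, b, alpha):
    # Closed-form density of Proposition A.3.
    sig = sigma_hat(x, T, v0, a, b, alpha)
    d1, d2 = d1_d2(x, T, v0, a, b, alpha)
    slope = a * T ** (alpha + 0.5)
    dd1 = -b * T ** alpha / (x * sig ** 2 * math.sqrt(T)) + 0.5 * slope / x
    return -norm_pdf(d2) * dd1 * (slope * d1 + 1.0)

def density_by_finite_differences(K, h, T, v0, a, b, alpha):
    # Central second difference of the call price in the strike.
    up = call_price(K + h, T, v0, a, b, alpha)
    mid = call_price(K, T, v0, a, b, alpha)
    down = call_price(K - h, T, v0, a, b, alpha)
    return (up - 2.0 * mid + down) / (h * h)

# Hypothetical parameter values, chosen only for illustration.
PARAMS = dict(T=0.25, v0=0.04, a=0.05, b=0.5, alpha=-0.4)

if __name__ == "__main__":
    for K in (0.03, 0.04, 0.05):
        print(K, rv_density(K, **PARAMS),
              density_by_finite_differences(K, 1e-5, **PARAMS))
```

The two columns printed for each strike should agree to several digits, which is simply the statement $\partial ^2 C/\partial K^2=\psi _{RV}$ evaluated numerically.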

Remark A.4

Note that Proposition A.3 gives the density of $RV(v^{(\gamma ,\nu )})(T)$ in closed-form. In addition, Proposition A.3 can be easily used to get the density of the Arithmetic Asian option under the Black–Scholes model. This would correspond to the case $\alpha =0$ and $\nu =\sigma >0$ as the Black–Scholes constant volatility.

Remark A.5

Assuming the density $\psi _{RV}$ exists, we have the following volatility swap price:

$$\begin{aligned} {\mathbb {E}}[\sqrt{RV(v^{(\gamma ,\nu )})(T)}] =\int _0^\infty \sqrt{x}\psi _{RV}(x,T)\mathrm {d}x. \end{aligned}$$

In Fig. 9, we provide numerical results for the volatility swap approximation, which performs best for short maturities, due to the nature of the approximation being motivated by small-time smile behaviour. Interestingly, it captures rather accurately the short time decay of the Volatility Swap price for maturities less than 3 months; for larger maturities the absolute error does not exceed 20 basis points.
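As a rough illustration of the integral in Remark A.5, the sketch below integrates $\sqrt{x}\,\psi _{RV}(x,T)$ with a trapezoid rule after the substitution $x=v_0\mathrm {e}^u$. It is not the scheme used for Fig. 9. It assumes the smile $\hat{\sigma }(x,T)=T^{\alpha }\left( b+a\log (x/v_0)\right) $ implied by the proof of Proposition A.3, truncates the strike range (a hypothetical choice), and requires $\hat{\sigma }>0$ on that range. For $a=0$ the density is lognormal, so the result can be compared with $\sqrt{v_0}\exp (-\sigma ^2T/8)$, where $\sigma =bT^{\alpha }$.

```python
import math

def rv_density(x, T, v0, a, b, alpha):
    # Density of Proposition A.3 under the ASSUMED smile
    # sigma_hat(x, T) = T**alpha * (b + a * log(x / v0)).
    sig = T ** alpha * (b + a * math.log(x / v0))
    if sig <= 0.0:
        raise ValueError("assumed smile is not positive at this strike")
    s = sig * math.sqrt(T)
    d1 = math.log(v0 / x) / s + 0.5 * s
    d2 = d1 - s
    slope = a * T ** (alpha + 0.5)
    dd1 = -b * T ** alpha / (x * sig ** 2 * math.sqrt(T)) + 0.5 * slope / x
    phi_d2 = math.exp(-0.5 * d2 * d2) / math.sqrt(2.0 * math.pi)
    return -phi_d2 * dd1 * (slope * d1 + 1.0)

def vol_swap_price(T, v0, a, b, alpha, width=8.0, n=4000):
    # Trapezoid rule in u = log(x / v0) on [-width * s0, width * s0],
    # where s0 is the at-the-money total volatility (hypothetical truncation).
    s0 = b * T ** alpha * math.sqrt(T)
    lo, hi = -width * s0, width * s0
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        x = v0 * math.exp(lo + k * h)
        weight = 0.5 if k in (0, n) else 1.0
        # dx = x du, hence the extra factor x.
        total += weight * math.sqrt(x) * rv_density(x, T, v0, a, b, alpha) * x
    return total * h

if __name__ == "__main__":
    T, v0, b, alpha = 0.25, 0.04, 0.5, -0.4  # hypothetical values
    print(vol_swap_price(T, v0, 0.0, b, alpha),
          math.sqrt(v0) * math.exp(-((b * T ** alpha) ** 2) * T / 8.0))
    print(vol_swap_price(T, v0, 0.05, b, alpha))
```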

Appendix B: Proof of main results

1.1 B.1: Proof of Theorem 3.3

Proof

For $t\in \mathcal {T}$, $\varepsilon >0$, we first define the rescaled processes

$$\begin{aligned} \begin{aligned} Z^{\varepsilon }_t&:= \varepsilon ^{\beta /2}Z_t \overset{\text {d}}{=} Z_{\varepsilon t}, \\ v^{\varepsilon }_t&:= v_0 \exp \left( Z^{\varepsilon }_t -\frac{\eta ^2}{2} (\varepsilon t)^{\beta } \right) ,&\end{aligned} \end{aligned}$$

(B.1)

where $\beta := 2 \alpha + 1 \in (0,1)$. From Schilder’s Theorem [17, Theorem 3.4.12] and Proposition 3.2, we have that the sequence of processes $(Z^{\varepsilon })_{\varepsilon >0}$ satisfies a large deviations principle on $\mathcal {C}(\mathcal {T})$ with speed $\varepsilon ^{-\beta }$ and rate function $\Lambda ^Z$. We now prove that the two sequences of stochastic processes $Z^\varepsilon $ and ${\widetilde{Z}}^\varepsilon := Z^\varepsilon - \frac{\eta ^2}{2}(\varepsilon \cdot )^\beta $ are exponentially equivalent [18, Definition 4.2.10]. For each $\delta >0$ and $t \in \mathcal {T}$, there exists $\varepsilon _* := \frac{1}{t} \left( \frac{2\delta }{\eta ^2}\right) ^{1/\beta }>0$ such that

$$\begin{aligned} \sup _{t\in \mathcal {T}} \vert Z^\varepsilon _t - {\widetilde{Z}}_t^\varepsilon | = \sup _{t\in \mathcal {T}} \vert \frac{\eta ^2}{2} (\varepsilon t )^\beta \vert \le \delta , \end{aligned}$$

for all $0<\varepsilon <\varepsilon _*$. Therefore, for all $\delta >0$, $\limsup _{\varepsilon \downarrow 0} \varepsilon ^\beta \log \mathbb {P}(\Vert Z^\varepsilon - {\widetilde{Z}}^\varepsilon \Vert _{\infty }>\delta ) = - \infty $, and the two processes are indeed exponentially equivalent. Then, using [18, Theorem 4.2.13], the sequence of stochastic processes $({\widetilde{Z}}^\varepsilon )_{\varepsilon >0 } $ also satisfies a large deviations principle on $\mathcal {C}(\mathcal {T})$, with speed $\varepsilon ^{-\beta }$ and rate function $\Lambda ^Z$. Moreover, for all $\varepsilon ,t$, we have that $v^{\varepsilon }_t = v_0 \exp ({\widetilde{Z}}^\varepsilon _t)$, where the bijective transformation $x\mapsto v_0\exp (x) $ is clearly continuous with respect to the sup norm metric. Therefore we can apply the Contraction Principle [18, Theorem 4.2.1], which is stated in “Appendix D”, concluding that the sequence of processes $(v^{\varepsilon })_{\varepsilon >0}$ satisfies a large deviations principle on $\mathcal {C}(\mathcal {T})$ with speed $\varepsilon ^{-\beta }$ and rate function $\Lambda ^Z\left( \log \left( \frac{x}{v_0} \right) \right) $. Here we have used that the inverse of the bijection $x\mapsto v_0\exp (x) $ is given by $x\mapsto \log \left( \frac{x}{v_0} \right) $. Finally, for all $\varepsilon > 0 $ we have $\sup _{t \in \mathcal {T}} \left| v^\varepsilon _t - v_{\varepsilon t} \right| = 0 $, which implies that for any $\delta > 0$, $\limsup _{\varepsilon \downarrow 0} \varepsilon ^{\beta } \log {\mathbb {P}} ( \Vert v^\varepsilon _\cdot - v_{(\varepsilon \cdot )} \Vert _{\infty } > \delta ) = -\infty $ trivially. Thus the sequences of processes $(v^\varepsilon _t)_{t \in \mathcal {T}}$ and $(v_{\varepsilon t })_{t \in \mathcal {T}}$ are exponentially equivalent and therefore satisfy the same LDP.
Notice also that $\Lambda ^v(v_0)=\Lambda ^Z(0)=\Vert 0 \Vert ^2_{{\mathscr {H}}^{K_\alpha } }=0.$ $\square $

1.2 B.2: Proof of Corollary 3.7

Proof

The proof of Eq. (3.4) is similar to the proof of [22, Corollary 4.9], and we shall prove the lower and upper bound separately, which turn out to be equal. Firstly, as the rate function ${\hat{\Lambda }}^v$ is continuous on $\mathcal {C}(\mathcal {T})$, we have that, for all $k>0$,

$$\begin{aligned} \lim _{t \downarrow 0} t^\beta \log \mathbb {P}(\log [RV(v)(t)]>k) = -\mathrm {I}(k), \end{aligned}$$

as an application of Corollary 3.4.

(1)
The proof of the lower bound is exactly the same as presented in [22, Appendix C] and will be omitted here; we arrive at $\liminf _{t \downarrow 0} t^{\beta }\log {\mathbb {E}} \left[ (RV(v)(t)-\mathrm {e}^k)^+\right] \ge -\mathrm {I}(k). $
(2)
To establish the upper bound, we closely follow [34]:

First we prove that
$$\begin{aligned} \lim _{t \downarrow 0} t^{\beta } \log {\mathbb {E}} [(RV(v)(t)-\mathrm {e}^k)^+] = \lim _{t \downarrow 0} t^{\beta } \log \mathbb {P}(RV(v)(t)\ge \mathrm {e}^k). \end{aligned}$$
(B.2)
We apply Hölder’s inequality:
$$\begin{aligned}&{\mathbb {E}} [(RV(v)(t)-\mathrm {e}^k)^+] = {\mathbb {E}} \left[ (RV(v)(t)-\mathrm {e}^k) \mathbf {1}_{\{RV(v)(t)\ge \mathrm {e}^k\}} \right] \\&\quad \le {\mathbb {E}} \left[ |RV(v)(t)-\mathrm {e}^k|^q \right] ^{1/q} {\mathbb {P}(RV(v)(t)\ge \mathrm {e}^k)}^{1-1/q}, \end{aligned}$$
which holds for all $q>1$. We note that for all $q\ge 2$, the mapping $x\mapsto x^q$ for any $x\ge 0$ is convex, therefore by Hölder’s inequality one obtains $\left( x+y\right) ^q\le \frac{x^q +y^q}{2^{1-q}}$ for any $x,y \ge 0$, which in turn implies
$$\begin{aligned} {\mathbb {E}} \left[ |RV(v)(t)-\mathrm {e}^k|^q \right] \le 2^{q-1}\left[ {\mathbb {E}} \left[ (RV(v)(t))^q\right] +(\mathrm {e}^k)^q \right] . \end{aligned}$$
(B.3)
We further obtain the following inequality by applying Jensen’s inequality and the fact that all moments exist for $(RV(v)(t))^q$
$$\begin{aligned} {\mathbb {E}} \left[ (RV(v)(t))^q\right]\le & {} \frac{1}{t^q} \int _0^t {\mathbb {E}}[v^q_s] \mathrm {d}s\le \frac{v^q_0}{t^q} \int _0^t \exp \left( \left( \frac{q^2\eta ^2}{2} - \frac{q\eta ^2}{2}\right) s^{2\alpha +1}\right) \mathrm {d}s\nonumber \\\le & {} \frac{v^q_0}{t^{q-1}} \exp \left( \left( \frac{q^2\eta ^2}{2} - \frac{q\eta ^2}{2}\right) t^{2\alpha +1}\right) \end{aligned}$$
(B.4)
Therefore, using (B.3) and (B.4) we obtain an upper bound
$$\begin{aligned} \limsup _{t \downarrow 0} t^{\beta } \log {\mathbb {E}} [(RV(v)(t)-\mathrm {e}^k)^+ ] \le \limsup _{t \downarrow 0} (1-1/q) t^{\beta } \log \mathbb {P}(RV(v)(t)\ge \mathrm {e}^k), \end{aligned}$$
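since the moment factor in Hölder’s bound is negligible on the logarithmic scale: by (B.3) and (B.4), $t^{\beta }\log {\mathbb {E}} \left[ |RV(v)(t)-\mathrm {e}^k|^q \right] ^{1/q}$ tends to zero as $t \downarrow 0$. As the bound holds for every $q\ge 2$, the factor $(1-1/q)< 1$ can be taken arbitrarily close to one. To obtain a lower bound, we have for any $\varepsilon >0$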
$$\begin{aligned} {\mathbb {E}} [(RV(v)(t)-\mathrm {e}^k)^+]\ge & {} {\mathbb {E}} \left[ (RV(v)(t)-\mathrm {e}^k)\mathbf {1}_{\{RV(v)(t)\ge \mathrm {e}^k+\varepsilon \}}\right] \\\ge & {} \varepsilon \mathbb {P}\left( RV(v)(t)\ge \mathrm {e}^k+\varepsilon \right) , \end{aligned}$$
which implies
$$\begin{aligned} \liminf _{t \downarrow 0} t^{\beta } \log {\mathbb {E}} [(RV(v)(t)-\mathrm {e}^k)^+] \ge \liminf _{t \downarrow 0} t^{\beta } \log \mathbb {P}\left( RV(v)(t)\ge \mathrm {e}^k+\varepsilon \right) . \end{aligned}$$
Now, using (B.2) leads to $\limsup _{t \downarrow 0} t^\beta \log {\mathbb {E}} \left[ (RV(v)(t)-\mathrm {e}^k)^+\right] \le -\mathrm {I}(k)$. The conclusion for Eq. (3.4) then follows directly.

The proof of Eq. (3.5) follows the same steps, after proving that the process $\sqrt{RV(v)}$ satisfies a large deviations principle on $\mathbb {R}_+$. Indeed, as the function $x \mapsto x^2$ is a continuous bijection on $\mathbb {R}_+$, we have that the square root of the integrated variance process $\sqrt{RV(v)}$ satisfies a large deviations principle on $\mathbb {R}_+$ as t tends to zero, with speed $t^{-\beta }$ and rate function ${\hat{\Lambda }}^v((\cdot )^2)$, using [18, Theorem 4.2.4]. $\square $

1.3 B.3: Proof of Theorem 3.10

Proof

For brevity we set $n=2$, but for larger n, identical arguments can be applied. From Schilder’s Theorem [17, Theorem 3.4.12] and Proposition 3.2, we have that the sequence of processes $(Z^{\varepsilon })_{\varepsilon >0}$ satisfies a large deviations principle on $\mathcal {C}(\mathcal {T})$ with speed $\varepsilon ^{-\beta }$ and rate function $\Lambda ^Z$. Define the operator $f:\mathcal {C}(\mathcal {T})\rightarrow \mathcal {C}(\mathcal {T},\mathbb {R}^2)$ by $f(\mathrm {x}):=(\frac{\nu _1}{\eta } \mathrm {x}, \frac{\nu _2}{\eta }\mathrm {x})$, which is clearly continuous with respect to the sup-norm $\Vert \cdot \Vert _{\infty }$ on $\mathcal {C}(\mathcal {T},\mathbb {R}^2)$. Applying the Contraction Principle then yields that the sequence of two-dimensional processes $((\frac{\nu _1}{\eta }Z^\varepsilon , \frac{\nu _2}{\eta }Z^\varepsilon ))_{\varepsilon >0}$ satisfies a large deviations principle on $\mathcal {C}(\mathcal {T},\mathbb {R}^2)$ as $\varepsilon $ tends to zero with speed $\varepsilon ^{-\beta }$ and rate function

$$\begin{aligned} {\tilde{\Lambda }}(\mathrm {y},\mathrm {z}):=\inf \{\Lambda ^Z(\mathrm {x}): f(\mathrm {x})=(\mathrm {y},\mathrm {z}) \}=\inf \left\{ \Lambda ^Z\left( \frac{\eta }{\nu _1}\mathrm {y}\right) : \mathrm {z}=\frac{\nu _2}{\nu _1}\mathrm {y}\right\} . \end{aligned}$$

Identical arguments to the proof of Theorem 3.3 give that the sequences of processes $((\frac{\nu _1}{\eta }Z^\varepsilon , \frac{\nu _2}{\eta }Z^\varepsilon ))_{\varepsilon >0}$ and $((\frac{\nu _1}{\eta }Z^\varepsilon -\frac{\nu _1^2}{2}(\varepsilon \cdot )^\beta , \frac{\nu _2}{\eta }Z^\varepsilon -\frac{\nu _2^2}{2}(\varepsilon \cdot )^\beta ) )_{\varepsilon >0}$ are exponentially equivalent, and thus satisfy the same large deviations principle, with the same rate function and the same speed.

We now define the operator $g^\gamma :\mathcal {C}(\mathcal {T},\mathbb {R}^2)\rightarrow \mathcal {C}(\mathcal {T})$ as $g^\gamma (\mathrm {x},\mathrm {y})=v_0(\gamma e^\mathrm {x}+ (1-\gamma )e^\mathrm {y}) $. For small perturbations $\delta ^\mathrm {x}, \delta ^\mathrm {y}\in \mathcal {C}(\mathcal {T})$ we have that

$$\begin{aligned}&\sup _{t\in \mathcal {T}} \vert g^\gamma (\mathrm {x}+\delta ^\mathrm {x}, \mathrm {y}+\delta ^\mathrm {y})-g^\gamma (\mathrm {x},\mathrm {y}) \vert \\&\quad \le \vert v_0 \vert \left( \sup _{t\in \mathcal {T}} \vert \gamma e^{\mathrm {x}(t)}(e^{\delta ^\mathrm {x}(t)}-1) \vert + \sup _{t\in \mathcal {T}} \vert (1-\gamma ) e^{\mathrm {y}(t)}(e^{\delta ^\mathrm {y}(t)}-1) \vert \right) . \end{aligned}$$

Clearly the right-hand side tends to zero as $\delta ^\mathrm {x},\delta ^\mathrm {y}$ tend to zero; thus the operator $g^\gamma $ is continuous with respect to the sup-norm $\Vert \cdot \Vert _{\infty }$ on $\mathcal {C}(\mathcal {T})$. Applying the Contraction Principle then yields that the sequence of processes $ (v^{(\varepsilon ,\gamma ,\nu )})_{\varepsilon>0}:= \Big (v_0 \Big ( \gamma \exp (\frac{\nu _1}{\eta }Z^\varepsilon -\frac{\nu _1^2}{2}(\varepsilon \cdot )^\beta ) +(1-\gamma )\exp (\frac{\nu _2}{\eta }Z^\varepsilon -\frac{\nu _2^2}{2}(\varepsilon \cdot )^\beta ) \Big )\Big )_{\varepsilon >0} $ satisfies a large deviations principle on $\mathcal {C}(\mathcal {T})$ as $\varepsilon $ tends to zero, with speed $\varepsilon ^{-\beta }$ and rate function

$$\begin{aligned}&\mathrm {x}\mapsto \inf \{{\tilde{\Lambda }}(\mathrm {y},\mathrm {z}):\mathrm {x}=g^\gamma (\mathrm {y},\mathrm {z}) \}\\&\quad =\inf \{\Lambda ^Z(\frac{\eta }{\nu _1}\mathrm {y}):\mathrm {x}=g^\gamma (\mathrm {y},\frac{\nu _2}{\nu _1}\mathrm {y}) \} \\&\quad = \inf \{\Lambda ^Z(\frac{\eta }{\nu _1}\mathrm {y}):\mathrm {x}=v_0 (\gamma e^\mathrm {y}+(1-\gamma )e^{\frac{\nu _2}{\nu _1}\mathrm {y}}) \}. \end{aligned}$$

Since, for all $\varepsilon >0$ and $t\in \mathcal {T}$, $v^{(\gamma ,\nu )}_{\varepsilon t}$ and $v^{(\varepsilon ,\gamma ,\nu )}_t$ are equal, the theorem follows immediately. Identical arguments to the proof of Theorem 3.3 then yield that $\Lambda ^\gamma (v_0)=0$. $\square $

1.4 B.4: Proof of Theorem 3.14

Proof

We begin by introducing a small-time rescaling of (2.6) for $\varepsilon >0$, so that the system becomes

$$\begin{aligned} v^{(\gamma , \nu , \Sigma ,\varepsilon )}_t := v^{(\gamma , \nu , \Sigma )}_{\varepsilon t} = v_0 \sum _{i=1}^n \gamma _i {\mathcal {E}}\left( \frac{\nu ^i}{\eta } \cdot \mathrm {L}_i {\mathcal {Z}}^{\varepsilon }_t \right) , \end{aligned}$$

(B.5)

with the rescaled process ${\mathcal {Z}}^{\varepsilon }_t$ defined as ${\mathcal {Z}}^{\varepsilon }_t := {\mathcal {Z}}_{\varepsilon t} = \varepsilon ^{\alpha +\frac{1}{2}} \Big (\int _0^t K_\alpha (s,t)\mathrm {d}W^1_s, \ldots , \int _0^t K_\alpha (s,t)\mathrm {d}W^m_s \Big )$.

The m-dimensional sequence of processes $(\varepsilon ^{\beta /2}(W^1,\cdots ,W^m))_{\varepsilon >0}$ satisfies a large deviations principle on ${\mathcal {C}}(\mathcal {T},\mathbb {R}^m)$ as $\varepsilon $ goes to zero with speed $\varepsilon ^{-\beta }$ and rate function $\Lambda ^m$ defined by $\Lambda ^m(\mathrm {x}) := \frac{1}{2} \left\| \mathrm {x} \right\| ^2_2$ for $\mathrm {x}\in {\mathscr {H}}_m$ and $+\infty $ otherwise, by Schilder’s Theorem [17, Theorem 3.4.12]. Here ${\mathscr {H}}_m$ is the reproducing kernel Hilbert space of the measure induced by $(W^1,\cdots ,W^m)$ on ${\mathcal {C}}(\mathcal {T},\mathbb {R}^m)$, defined as

$$\begin{aligned} {\mathscr {H}}_m := \left\{ (g^1,\cdots , g^m)\in {\mathcal {C}}(\mathcal {T},\mathbb {R}^m): g^i (t) = \int _0^t f^i(s) \mathrm {d}s, f^i\in L^2(\mathcal {T}) \text { for all } i\in 1\cdots m \right\} . \end{aligned}$$

Then, using an extension of the proof of [26, Theorem 3.6], for $i=1,\cdots , n$, the sequence of m-dimensional processes $\left( \mathrm {L}_i {\mathcal {Z}}^\varepsilon _\cdot \right) _{\varepsilon >0}$ satisfies a large deviations principle on ${\mathcal {C}}(\mathcal {T},\mathbb {R}^m)$ as $\varepsilon $ tends to zero with speed $\varepsilon ^{-\beta }$ and rate function $ \mathrm {y}\mapsto \inf \left\{ \Lambda ^m(\mathrm {x}): \mathrm {x}\in {\mathscr {H}}_m, \mathrm {y} = \mathrm {L}_i \mathrm {x}\right\} $ with $\mathrm {L}_i$ the lower triangular matrix introduced in Model (2.6). Consequently, for $i=1,\cdots , n$ each (one-dimensional) sequence of processes $\left( \frac{\nu ^i}{\eta }\cdot \mathrm {L}_i {\mathcal {Z}}^\varepsilon _\cdot \right) _{\varepsilon >0}$ also satisfies a large deviations principle as $\varepsilon $ tends to zero, with speed $\varepsilon ^{-\beta }$ and rate function $ \Lambda _{\Sigma _i} (\mathrm {y}) := \inf \left\{ \Lambda ^m(\mathrm {x}): \mathrm {x}\in {\mathscr {H}}_m, \mathrm {y} = \frac{\nu ^i}{\eta }\cdot \mathrm {L}_i \mathrm {x}\right\} $.

Analogously to Theorem 3.3, the sequences of processes $\left( \frac{\nu ^i}{\eta } \cdot \mathrm {L}_i {\mathcal {Z}}^\varepsilon _\cdot \right) _{\varepsilon >0}$ and $\Big (\frac{\nu ^i}{\eta }\cdot \mathrm {L}_i {\mathcal {Z}}^\varepsilon _\cdot - \frac{1}{2} \nu ^i \Sigma _i \nu ^i (\varepsilon \cdot )^\beta \Big )_{\varepsilon >0} $ are exponentially equivalent for each $i=1,\cdots ,n$; therefore they satisfy the same large deviations principle with the same speed $\varepsilon ^{-\beta }$ and the same rate function $\Lambda _{\Sigma _i}$.

We now define the operator $g^\gamma : {\mathcal {C}}(\mathcal {T}, \mathbb {R}^n) \rightarrow {\mathcal {C}}(\mathcal {T})$ as

$$\begin{aligned} g^\gamma (\mathrm {x})(\cdot ) := v_0 \sum _{i=1}^n \gamma _i \exp \left( \frac{\nu ^i}{\eta }\cdot \mathrm {x}(\cdot ) \right) , \end{aligned}$$

with $\mathrm {x}:=(\mathrm {x}_1,\cdots ,\mathrm {x}_n)$. For small perturbations $\delta ^1,\cdots , \delta ^n \in {\mathcal {C}}(\mathcal {T})$ with $\delta := (\delta ^1,\cdots , \delta ^n)$, we have that

$$\begin{aligned} \sup _{t\in \mathcal {T}} \left| g^\gamma (\mathrm {x}+\delta )(t)-g^\gamma (\mathrm {x})(t) \right|&= \sup _{t\in \mathcal {T}} \left| v_0 \sum _{i=1}^n \gamma _i \left[ \exp \left( \frac{\nu ^i}{\eta }\cdot (\mathrm {x}(t)+\delta (t)) \right) -\exp \left( \frac{\nu ^i}{\eta }\cdot \mathrm {x}(t)\right) \right] \right| \\&\le \sup _{t\in \mathcal {T}} \vert v_0 \vert \sum _{i=1}^n \left| \gamma _i \exp \left( \frac{\nu ^i}{\eta }\cdot \mathrm {x}(t)\right) \left( \exp \left( \frac{\nu ^i}{\eta }\cdot \delta (t)\right) -1\right) \right| . \end{aligned}$$

The right-hand side tends to zero as $\delta ^1,\cdots ,\delta ^n$ tend to zero; thus the operator $g^\gamma $ is continuous with respect to the sup-norm $\left\| \cdot \right\| _{\infty }$ on ${\mathcal {C}}(\mathcal {T})$. Using that $v^{(\gamma , \nu , \Sigma ,\varepsilon )}_t= g^\gamma (\frac{\nu ^i}{\eta }\cdot \mathrm {L}_i {\mathcal {Z}}^\varepsilon _\cdot - \frac{1}{2} \nu ^i \Sigma _i \nu ^i (\varepsilon \cdot )^\beta )(t) $ for each $\varepsilon >0$ and $t \in \mathcal {T}$, an application of the Contraction Principle then yields that the sequence of processes $(v^{(\gamma , \nu , \Sigma ,\varepsilon )})_{\varepsilon >0}$ satisfies a large deviations principle on ${\mathcal {C}}(\mathcal {T})$ as $\varepsilon $ tends to zero, with speed $\varepsilon ^{-\beta }$ and rate function

$$\begin{aligned}&\mathrm {y}\mapsto \inf \left\{ \Lambda _{\Sigma _i} (\mathrm {x}) : \mathrm {y} = v_0 \sum _{i=1}^n \gamma _i \exp \left( \frac{\nu ^i}{\eta }\cdot \mathrm {x} \right) \right\} \\&\quad = \inf \left\{ \Lambda ^m(\mathrm {x}) : \mathrm {x} \in {\mathscr {H}}_m, \mathrm {y} = v_0 \sum _{i=1}^n \gamma _i \exp \left( \frac{\nu ^i}{\eta } \cdot \mathrm {L}_i \mathrm {x}\right) \right\} . \end{aligned}$$

As with the previous two models we have that, for all $\varepsilon >0$ and $t\in \mathcal {T}$, $v^{(\gamma , \nu , \Sigma ,\varepsilon )}_t$ and $ v^{(\gamma , \nu , \Sigma )}_{\varepsilon t}$ are equal in law and so the result follows directly. $\square $

1.5 B.5: Proof of Proposition 5.2

Proof

We recall that the inner product on $\mathscr {H}^{g,T}$ is defined by $ \langle \mathcal {I}^{g,T}f_1, \mathcal {I}^{g,T}f_2 \rangle _{\mathscr {H}^{g,T}}:= \langle f_1, f_2 \rangle _{L^2[0,T]}, $ where

$$\begin{aligned} \mathcal {I}^{g,T}f(\cdot )= \int _0^T g(\cdot ,s) f(s) \mathrm {d}s, \quad \mathscr {H}^{g,T}= \{\mathcal {I}^{g,T}f: f\in L^2[0,T] \}. \end{aligned}$$

The proof of Proposition 5.2, which is similar to the proofs given in [31], is made up of three parts. The first part is to prove that $ \left( \mathscr {H}^{g,T}, \langle \cdot , \cdot \rangle _{\mathscr {H}^{g,T}} \right) $ is a separable Hilbert space. Clearly $\mathcal {I}^{g,T}$ is surjective on $\mathscr {H}^{g,T}$. Now take $f_1,f_2 \in L^2[0,T] $ such that $\mathcal {I}^{g,T}f_1 = \mathcal {I}^{g,T}f_2$. For any $t\in [T,T+\Delta ]$ it follows that $\int _0^T g(t,s)[f_1(s)-f_2(s)]\mathrm {d}s = 0 $; applying the Titchmarsh convolution theorem then implies that $f_1=f_2$ almost everywhere and so $\mathcal {I}^{g,T}:L^2[0,T]\rightarrow \mathscr {H}^{g,T}$ is a bijection. $\mathcal {I}^{g,T}$ is a linear operator, and therefore $ \langle \cdot , \cdot \rangle _{\mathscr {H}^{g,T}} $ is indeed an inner product; hence $ \left( \mathscr {H}^{g,T}, \langle \cdot , \cdot \rangle _{\mathscr {H}^{g,T}} \right) $ is a real inner product space. Since $L^2[0,T]$ is a complete (Hilbert) space, for every Cauchy sequence of functions $\{ f_n \}_{n \in {\mathbb {N}} }\in L^2[0,T] $, there exists a function ${\widetilde{f}} \in L^2[0,T]$ such that $\{ f_n \}_{n \in {\mathbb {N}} }$ converges to ${\widetilde{f}} \in L^2[0,T]$; note also that $\{\mathcal {I}^{g,T}f_n\}_{n\in {\mathbb {N}}}$ converges to $\mathcal {I}^{g,T}{\widetilde{f}}$ in $\mathscr {H}^{g,T}$. Assume for a contradiction that there exists $f \in L^2[0,T]$ such that $f \ne {\widetilde{f}}$ and $\{\mathcal {I}^{g,T}f_n\}_{n\in {\mathbb {N}}}$ converges to $\mathcal {I}^{g,T}f$ in $\mathscr {H}^{g,T}$, then, since $\mathcal {I}^{g,T}$ is a bijection, the triangle inequality yields

$$\begin{aligned} 0 < \left\| \mathcal {I}^{g,T}f - \mathcal {I}^{g,T}{\widetilde{f}} \right\| _{\mathscr {H}^{g,T}} \le \left\| \mathcal {I}^{g,T}f - \mathcal {I}^{g,T}f_n \right\| _{\mathscr {H}^{g,T}} +\left\| \mathcal {I}^{g,T}{\widetilde{f}} - \mathcal {I}^{g,T}f_n\right\| _{\mathscr {H}^{g,T}}, \end{aligned}$$

which converges to zero as n tends to infinity. Therefore $f={\widetilde{f}}$, $\mathcal {I}^{g,T}f \in \mathscr {H}^{g,T}$ and $\mathscr {H}^{g,T}$ is complete, hence a real Hilbert space. Since $L^2[0,T]$ is separable with countable orthonormal basis $\{\phi _n\}_{n \in {\mathbb {N}}}$, then $\{\mathcal {I}^{g,T}\phi _n\}_{n \in {\mathbb {N}}}$ is an orthonormal basis for $\mathscr {H}^{g,T}$, which is then separable.

The second part of the proof is to show that there exists a dense embedding $\iota : \mathscr {H}^{g,T}\rightarrow \mathcal {C}[T,T+\Delta ]$. Since there exists $ h \in L^2[0,T]$ such that $\int _0^{\varepsilon } |h(s)|\mathrm {d}s>0$ for all $\varepsilon >0$ and $g(t,\cdot )=h(t-\cdot )$ for any $t \in [T,T+\Delta ]$, we can apply [15, Lemma 2.1], which tells us that $\mathscr {H}^{g,T}$ is dense in $ \mathcal {C}[T,T+\Delta ]$, and so we choose the embedding to be the inclusion map.

Finally we must prove that every $f^* \in \mathcal {C}[T,T+\Delta ]^*$, where $f^*$ is defined in [31, Definition 3.1], is Gaussian on $ \mathcal {C}[T,T+\Delta ]$, with variance $\Vert \iota ^* f^* \Vert _{{\mathscr {H}^{g,T}}^*} $, where $\iota ^*$ is the dual of $\iota $. Take $f^* \in \mathcal {C}[T,T+\Delta ]^*$; then, by the fact that $(\mathcal {C}[T,T+\Delta ],{\mathscr {H}}^{g,T},\mu )$ is a RKHS triplet by similar arguments to [22, Lemma 3.1], we obtain using [31, Remark 3.6] that $\iota ^*$ admits an isometric embedding $ {\overline{\iota }}^*$ such that

$$\begin{aligned} \Vert {\overline{\iota }}^* f^* \Vert _{{\mathscr {H}^{g,T}}^*} = \Vert f^* \Vert _{L^2(\mathcal {C}[T,T+\Delta ] ,\mu )} = \int _{\mathcal {C}[T,T+\Delta ]} \!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\! (f^*)^2\mathrm {d}\mu = \text {VAR}(f^*), \end{aligned}$$

where $\mu $ is the Gaussian measure induced by the process $\int _0^T g(\cdot ,s) \mathrm {d}W_s $ on $\Big ( \mathcal {C}[T,T+\Delta ], {\mathscr {B}}(\mathcal {C}[T,T+\Delta ]) \Big ).$ $\square $

1.6 B.6: Proof of Theorem 5.6

Proof

First we recall ${\widetilde{V}}^{g,T,\varepsilon }_t:= V^{g,T,\varepsilon }_t - \frac{\varepsilon ^\gamma }{2}\int _0^tg^2(t,u)\mathrm {d}u + \varepsilon ^{\gamma /2}$. We begin the proof by showing that the sequences of processes $(V^{g,T,\varepsilon })_{\varepsilon \in [0,1]}$ and $({\widetilde{V}}^{g,T,\varepsilon })_{\varepsilon \in [0,1]} $ are exponentially equivalent [18, Definition 4.2.10]. As $g(t,\cdot ) \in L^2[0,T]$ for all $t\in [T,T+\Delta ]$, for each $\delta >0$ there exists $\varepsilon _*>0$ such that $ \sup _{t\in [T,T+\Delta ]} \left| \varepsilon _*^{\gamma /2} - \frac{\varepsilon _*^\gamma }{2}\int _0^T g^2(t,u)\mathrm {d}u \right| \le \delta . $ Therefore, for the $\mathcal {C}[T,T+\Delta ]$ norm $\Vert \cdot \Vert _{\infty }$ we have that for all $\varepsilon _*>\varepsilon >0$,

$$\begin{aligned} \mathbb {P}\left( \left\| V^{g,T,\varepsilon } -{\widetilde{V}}^{g,T,\varepsilon } \right\| _{\infty }> \delta \right) = \mathbb {P}\left( \sup _{t\in [T,T+\Delta ]} \left| \varepsilon ^{\gamma /2} - \frac{\varepsilon ^\gamma }{2}\int _0^T g^2(t,u)\mathrm {d}u \right| > \delta \right) =0. \end{aligned}$$

Therefore $\limsup _{\varepsilon \downarrow 0} \varepsilon ^\gamma \log \mathbb {P}\left( \left\| V^{g,T,\varepsilon } -{\widetilde{V}}^{g,T,\varepsilon } \right\| _{\infty } > \delta \right) = - \infty $, and so the two sequences of processes $(V^{g,T,\varepsilon })_{\varepsilon \in [0,1]}$ and $({\widetilde{V}}^{g,T,\varepsilon })_{\varepsilon \in [0,1]} $ are exponentially equivalent; applying [18, Theorem 4.2.13] then yields that $({\widetilde{V}}^{g,T,\varepsilon })_{\varepsilon \in [0,1]} $ satisfies a large deviations principle on $\mathcal {C}[T,T+\Delta ]$ with speed $\varepsilon ^{-\gamma }$ and rate function $\Lambda ^V$.

We now prove that the operators $\varphi _{1,\xi _0}$ and $\varphi _2$ are continuous with respect to the $\mathcal {C}([T,T+\Delta ]\times [0,1]) $ and $\mathcal {C}[0,1]$ $\Vert \cdot \Vert _{\infty } $ norms respectively. The proofs are very simple, and are included for completeness. First let us take a small perturbation $\delta ^f \in \mathcal {C}([T,T+\Delta ]\times [0,1])$:

$$\begin{aligned} \left\| \varphi _{1,\xi _0}(f+\delta ^f) - \varphi _{1,\xi _0}(f) \right\| _{\infty }&= \sup _{\begin{array}{c} \varepsilon \in [0,1] \\ s \in [T,T+\Delta ] \end{array}} \left| \xi _0(s)e^{f(s,\varepsilon )}\left( e^{\delta ^f(s,\varepsilon )}-1\right) \right| \\&\le \sup _{\begin{array}{c} \varepsilon \in [0,1] \\ s \in [T,T+\Delta ] \end{array}} \vert \xi _0(s) \vert \sup _{\begin{array}{c} \varepsilon \in [0,1] \\ s \in [T,T+\Delta ] \end{array}} \vert e^{f(s,\varepsilon )} \vert \sup _{\begin{array}{c} \varepsilon \in [0,1] \\ s \in [T,T+\Delta ] \end{array}} \vert e^{\delta ^f(s,\varepsilon )}-1 \vert . \end{aligned}$$

Since $\xi _0$ is continuous on $[T,T+\Delta ]$ and f is continuous on $[T,T+\Delta ]\times [0,1]$, they are both bounded. Clearly $e^{\delta ^f(s,\varepsilon )}-1$ tends to zero as $\delta ^f$ tends to zero and hence the operator $\varphi _{1,\xi _0}$ is continuous. Now take a small perturbation $\delta ^f \in \mathcal {C}([T,T+\Delta ]\times [0,1])$:

$$\begin{aligned} \left\| \varphi _2(f+\delta ^f) -\varphi _2(f) \right\| _{\infty } = \sup _{\varepsilon \in [0,1]} \left| \frac{1}{\Delta } \int _T^{T+\Delta } \delta ^f(s,\varepsilon )\mathrm {d}s \right| \le M, \end{aligned}$$

where $M:=\sup _{\varepsilon \in [0,1],\, s \in [T,T+\Delta ]}\vert \delta ^f(s,\varepsilon )\vert $. Clearly M tends to zero as $\delta ^f$ tends to zero, thus the operator $\varphi _2$ is also continuous.

For every $s \in [T,T+\Delta ]$ we have the following: by an application of the Contraction Principle [18, Theorem 4.2.1] and using the fact that $\varepsilon \mapsto (\varphi _{1,\xi _0}f)(s,\varepsilon )$ is a bijection for all $f\in \mathcal {C}[T,T+\Delta ]$ it follows that the sequence of stochastic processes $ \left( \left( \varphi _{1,\xi _0} {\widetilde{V}}_s^{g,T,\varepsilon }\right) (s,\varepsilon ) \right) _{\varepsilon \in [0,1]} $ satisfies a large deviations principle on $\mathcal {C}[0,1]$ as $\varepsilon $ tends to zero with speed $\varepsilon ^{-\gamma }$ and rate function

$$\begin{aligned} {\hat{\Lambda }}^V_s(\mathrm {y}):=\Lambda ^V\left( (\varphi _{1,\xi _0} \mathrm {y})^{-1}(s,\cdot ) \right) =\Lambda ^V\left( \log \left( \frac{\mathrm {y}(s,\cdot )}{\xi _0(s)} \right) \right) . \end{aligned}$$

A second application of the Contraction Principle then yields that the sequence of stochastic processes $\left( \ (\varphi _2 (\varphi _{1,\xi _0} {\widetilde{V}}_s^{g,T,\varepsilon })) (\varepsilon ) \right) _{\varepsilon \in [0,1]} $ satisfies a large deviations principle on $\mathcal {C}[0,1]$ with speed $\varepsilon ^{-\gamma }$ and rate function $\Lambda ^{\text {VIX}}(\mathrm {x})= \inf _{s\in [T,T+\Delta ]} \{ \Lambda ^V\left( (\varphi _{1,\xi _0} \mathrm {y})^{-1}(s,\cdot ) \right) : \mathrm {x}(\cdot ) = (\varphi _2\mathrm {y})(\cdot ) \}. $ By definition, the sequence of processes $\left( \ (\varphi _2 (\varphi _{1,\xi _0} {\widetilde{V}}_s^{g,T,\varepsilon })) (\varepsilon ) \right) _{\varepsilon \in [0,1]} $ is almost surely equal to the rescaled VIX processes $(e^{\varepsilon ^{\gamma /2}}\text {VIX}_{T,\varepsilon ^{\gamma /2}})_{\varepsilon \in [0,1]} $ and hence satisfies the same large deviations principle. $\square $

Appendix C: Numerical recipes

We first consider the simple rough Bergomi model (2.4) for the sake of simplicity, and then develop the mixed multi-factor rough Bergomi model (2.6) in “Appendix C.3” [which also includes (2.5)]. We therefore tackle the numerical computation of the rate function

$$\begin{aligned} {\hat{\Lambda }}^v(\mathrm {y}) := \inf \left\{ \Lambda ^v(\mathrm {x}) : \mathrm {y}= RV(\mathrm {x})(1) \right\} . \end{aligned}$$

This problem, in turn, is equivalent to the following optimisation:

$$\begin{aligned} {\hat{\Lambda }}^v(\mathrm {y}):=\inf _{f\in L^2[0,1]} \left\{ \frac{1}{2}||f||^2 : \mathrm {y}= RV\left( \exp \left( \int _0^\cdot K_{\alpha }(u,\cdot )f(u)\mathrm {d}u\right) \right) (1) \right\} . \end{aligned}$$

(C.1)

A natural approach is to consider a class of functions that is dense in $L^2[0,1]$. The Stone-Weierstrass theorem states that any continuous function on a closed interval can be uniformly approximated by a polynomial function. Consequently, we consider a polynomial basis,

$$\begin{aligned} {\hat{f}}^{(n)}(s)=\sum _{i=0}^n a_i s^i \end{aligned}$$

such that $\{ {\hat{f}}^{(n)} \}_{a_i \in \mathbb {R}}$ is dense in $L^2[0,1]$ as n tends to $+\infty $. Problem (C.1) may then be approximated via

$$\begin{aligned} {\hat{\Lambda }}^v_n(\mathrm {y}):=\inf _{a\in {\mathbb {R}}^{n+1}} \left\{ \frac{1}{2}||{\hat{f}}^{(n)}||^2 : \mathrm {y}= RV\left( \exp \left( \int _0^\cdot K_{\alpha }(u,\cdot ){\hat{f}}^{(n)}(u)\mathrm {d}u\right) \right) (1) \right\} , \end{aligned}$$

where $a=(a_0,\ldots ,a_n)$. In order to obtain the solution, first the constraint $\mathrm {y}= RV\left( \exp \left( \int _0^\cdot K_{\alpha }(u,\cdot ){\hat{f}}^{(n)}(u)\mathrm {d}u\right) \right) (1)$ needs to be satisfied. To accomplish this, we consider anchoring one of the coefficients in ${\hat{f}}^{(n)}$ such that

$$\begin{aligned} a_i^*={{\,\mathrm{\arg \!\min }\,}}_{a_i\in {\mathbb {R}}} \left\{ \left( \mathrm {y}-RV\left( \exp \left( \int _0^\cdot K_{\alpha }(u,\cdot ){\hat{f}}^{(n)}(u)\mathrm {d}u\right) \right) (1)\right) ^2 \right\} \end{aligned}$$

(C.2)

and the constraint will be satisfied for all combinations of the vector $a^*=(a_0,\ldots ,a_{i-1},a_i^*,a_{i+1},\ldots ,a_n)$. Numerically, (C.2) is easily solved using a few iterations of the Newton-Raphson algorithm. Then we may easily solve

$$\begin{aligned} \inf _{a^*\in {\mathbb {R}}^{n+1}} \left\{ \frac{1}{2}||{\hat{f}}^{(n)}||^2\right\} \end{aligned}$$

which will converge to the original problem (C.1) as $n\rightarrow +\infty $. The polynomial basis is particularly convenient since we have that

$$\begin{aligned}&RV\left( \exp \left( \int _0^\cdot K_{\alpha }(u,\cdot ){\hat{f}}^{(n)}(u)\mathrm {d}u\right) \right) (1)\nonumber \\&\quad =\int _0^1 \exp \left( \eta \sqrt{2\alpha +1} \sum _{i=0}^n\frac{a_i s^{\alpha +1+i} \text { }_2F_1(i+1,-\alpha ,i+2,1)}{i+1}\right) \mathrm {d}s, \end{aligned}$$

(C.3)

where ${}_2F_1$ denotes the Gaussian hypergeometric function. In particular, one may store the values $\{ {}_2F_1 \left( i+1,-\alpha ,i+2,1\right) \}_{i=0}^n$ in memory and reuse them across iterations. In addition, the outer integral in (C.3) is efficiently computed using Gauss–Legendre quadrature, i.e.

$$\begin{aligned}&RV\left( \exp \left( \int _0^\cdot K_{\alpha }(u,\cdot ){\hat{f}}^{(n)}(u)\mathrm {d}u\right) \right) (1)\\&\quad \approx \frac{1}{2}\sum _{k=1}^m\exp \left( \eta \sqrt{2\alpha +1} \sum _{i=0}^n\frac{a_i \left( \frac{1}{2}(1+p_k)\right) ^{\alpha +1+i} \text { }_2F_1(i+1,-\alpha ,i+2,1)}{i+1}\right) w_k, \end{aligned}$$

where $\{p_k,w_k\}_{k=1}^m$ are the nodes and weights of the m-point Gauss–Legendre rule on $[-1,1]$, respectively.
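The quadrature of (C.3) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `kernel_coeff` and `rv_polynomial` are hypothetical, and the value of ${}_2F_1$ at argument 1 is evaluated through Gauss's summation theorem, which applies here because $1+\alpha >0$.

```python
import math
import numpy as np

def kernel_coeff(i, alpha):
    """2F1(i+1, -alpha; i+2; 1) / (i+1), evaluated with Gauss's summation theorem."""
    # 2F1(a, b; c; 1) = G(c) G(c-a-b) / (G(c-a) G(c-b)) when c - a - b > 0.
    # Here a = i+1, b = -alpha, c = i+2, so c - a = 1 and c - a - b = 1 + alpha.
    hyp = math.gamma(i + 2) * math.gamma(1 + alpha) / math.gamma(i + 2 + alpha)
    return hyp / (i + 1)

def rv_polynomial(a, alpha, eta, m=32):
    """Gauss-Legendre approximation of the right-hand side of (C.3)."""
    nodes, weights = np.polynomial.legendre.leggauss(m)  # rule on [-1, 1]
    s = 0.5 * (1.0 + nodes)                              # nodes mapped to [0, 1]
    a = np.asarray(a, dtype=float)
    # These coefficients do not depend on a, so they can be cached and reused.
    coeffs = np.array([kernel_coeff(i, alpha) for i in range(len(a))])
    powers = alpha + 1.0 + np.arange(len(a))
    exponent = (a * coeffs) @ (s[None, :] ** powers[:, None])
    scale = eta * math.sqrt(2.0 * alpha + 1.0)
    return 0.5 * np.sum(weights * np.exp(scale * exponent))

# Illustrative parameter values only (not taken from the paper).
rv_value = rv_polynomial([0.2, -0.1, 0.05], alpha=-0.3, eta=1.5)
```

For $\alpha =0$, $\eta =1$ and ${\hat{f}}^{(0)}\equiv 1$ the exponent reduces to $s$, so the routine should return $e-1$, which gives a quick sanity check.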

C.1 Convergence analysis of the numerical scheme

In Fig. 10 we report the absolute differences $|{\hat{\Lambda }}^v_{n+1}(\mathrm {y})-{\hat{\Lambda }}^v_n(\mathrm {y})|$, and observe that they decrease as we increase n, the degree of the approximating polynomial. We emphasise that both numerical routines performing the minimisation steps have a default tolerance of $10^{-8}$; we therefore cannot expect accuracies beyond this tolerance, which is why Fig. 10 becomes noisy once this accuracy is reached. Convergence is fast: with just $n=5$ we usually obtain an accuracy of $10^{-5}$.
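The scheme whose convergence is reported above can be sketched end to end as follows. This is only an illustration under stated assumptions: the parameter values, the choice to anchor $a_0$, the Newton iteration applied to the logarithm of the constraint, and the Nelder-Mead optimiser from SciPy are all choices made here, not details confirmed by the paper. The names `basis`, `anchor_a0` and `rate_n` are hypothetical.

```python
import math
import numpy as np
from scipy.optimize import minimize

ALPHA, ETA = -0.3, 1.5                       # illustrative values, not from the paper
NODES, WEIGHTS = np.polynomial.legendre.leggauss(64)
S = 0.5 * (1.0 + NODES)                      # quadrature nodes mapped to [0, 1]
K = ETA * math.sqrt(2.0 * ALPHA + 1.0)

def basis(n):
    """Row i holds s^(alpha+1+i) * 2F1(i+1,-alpha;i+2;1)/(i+1) at the nodes."""
    rows = []
    for i in range(n + 1):
        coeff = math.gamma(i + 1) * math.gamma(1 + ALPHA) / math.gamma(i + 2 + ALPHA)
        rows.append(coeff * S ** (ALPHA + 1 + i))
    return np.array(rows)

def anchor_a0(rest, y, B, iters=50, tol=1e-13):
    """Newton-Raphson on log(RV(a)) - log(y) in a_0, other coefficients fixed."""
    a = np.concatenate(([0.0], rest))
    for _ in range(iters):
        integrand = WEIGHTS * np.exp(K * (a @ B))
        value = 0.5 * integrand.sum()                 # RV(a) as in (C.3)
        slope = 0.5 * (integrand * K * B[0]).sum()    # d RV / d a_0
        step = math.log(value / y) * value / slope
        a[0] -= step
        if abs(step) < tol:
            break
    return a

def rate_n(y, n):
    """Approximate Lambda_n(y): anchor a_0, then minimise over a_1, ..., a_n."""
    B = basis(n)
    idx = np.arange(n + 1)
    H = 1.0 / (idx[:, None] + idx[None, :] + 1.0)     # ||f||^2 = a^T H a on [0, 1]

    def objective(rest):
        a = anchor_a0(rest, y, B)
        return 0.5 * a @ H @ a

    if n == 0:                                        # nothing left to optimise
        return objective(np.zeros(0))
    res = minimize(objective, np.zeros(n), method="Nelder-Mead",
                   options={"xatol": 1e-10, "fatol": 1e-12, "maxiter": 4000})
    return res.fun

values = [rate_n(1.5, n) for n in range(4)]
diffs = [abs(values[n + 1] - values[n]) for n in range(3)]
```

Because the optimiser is started from $a_1=\cdots =a_n=0$, each value is no larger than the degree-zero value; the list `diffs` plays the role of the successive differences discussed above, subject to the optimiser tolerances.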

C.2 A tailor-made polynomial basis for rough volatility

We may improve the computation time of the previous approach by considering a tailor-made polynomial basis. In particular, recall the following relation

$$\begin{aligned} \int _0^s K_\alpha (u,s) u^k\mathrm {d}u=\frac{s^{\alpha +1+k} {}_2F_1(k+1,-\alpha ,k+2,1)}{k+1}, \end{aligned}$$

then, for $k=-\alpha -1$ we obtain

$$\begin{aligned} \int _0^s K_\alpha (u,s) u^{-\alpha -1}\mathrm {d}u=\frac{ {}_2F_1(-\alpha ,-\alpha ,1-\alpha ,1)}{-\alpha }, \end{aligned}$$

which in turn is a constant that does not depend on the upper integral bound s.

Proposition C.1

Consider the basis ${\hat{g}}^{(n)}(s)=c s^{-\alpha -1}+\sum _{i=0}^n a_i s^i$, where $c\in {\mathbb {R}}$. Then, for $c=c^*$ with

$$\begin{aligned}&c^*=\frac{-\alpha }{\eta \sqrt{2\alpha +1} \text {}_2F_1(-\alpha ,-\alpha ,1-\alpha ,1) }\\&\quad \log \left( \frac{y}{\int _0^1 \exp \left( \eta \sqrt{2\alpha + 1} \sum _{i=0}^n\frac{a_i s^{\alpha +1+i} \text { }_2F_1(i+1,-\alpha ,i+2,1)}{i+1}\right) \mathrm {d}s}\right) , \end{aligned}$$

${\hat{g}}^{(n)}(s)$ solves (C.2).

Proof

We have that

$$\begin{aligned}&RV\left( \exp \left( \int _0^\cdot K_{\alpha }(u,\cdot ){\hat{g}}^{(n)}(u)\mathrm {d}u\right) \right) (1)\\&\quad =\exp \left( \frac{c\,\eta \sqrt{2\alpha +1}}{-\alpha }\text { }_2F_1\left( -\alpha ,-\alpha ,1-\alpha ,1\right) \right) \\&\qquad \times \int _0^1 \exp \left( \eta \sqrt{2\alpha +1} \sum _{i=0}^n\frac{a_i s^{\alpha +1+i} \text { }_2F_1\left( i+1,-\alpha ,i+2,1\right) }{i+1}\right) \mathrm {d}s \end{aligned}$$

and the proof follows by solving $y=RV\left( \exp \left( \int _0^\cdot K_{\alpha }(u,\cdot ){\hat{g}}^{(n)}(u)\mathrm {d}u\right) \right) (1)$ for $c$. $\square $
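The closed form of Proposition C.1 is straightforward to evaluate. The sketch below is an illustration with hypothetical names (`c_star`) and illustrative parameter values; it assumes $\alpha <0$ and uses the identity ${}_2F_1(-\alpha ,-\alpha ,1-\alpha ,1)/(-\alpha )=\Gamma (-\alpha )\Gamma (1+\alpha )$, which follows from Gauss's summation theorem.

```python
import math
import numpy as np

def c_star(y, a, alpha, eta, m=64):
    """Return (c*, constant, polynomial part) for the basis of Proposition C.1."""
    nodes, weights = np.polynomial.legendre.leggauss(m)
    s = 0.5 * (1.0 + nodes)                      # nodes mapped to [0, 1]
    k = eta * math.sqrt(2.0 * alpha + 1.0)
    exponent = np.zeros_like(s)
    for i, a_i in enumerate(a):
        # 2F1(i+1,-alpha;i+2;1)/(i+1) written with Gamma functions
        coeff = math.gamma(i + 1) * math.gamma(1 + alpha) / math.gamma(i + 2 + alpha)
        exponent += a_i * coeff * s ** (alpha + 1 + i)
    poly_part = 0.5 * np.sum(weights * np.exp(k * exponent))
    # 2F1(-alpha,-alpha;1-alpha;1)/(-alpha), independent of s (requires alpha < 0)
    const = math.gamma(-alpha) * math.gamma(1 + alpha)
    c = math.log(y / poly_part) / (k * const)
    return c, const, poly_part

# Illustrative inputs only.
alpha, eta, y = -0.3, 1.5, 1.3
c, const, poly_part = c_star(y, [0.2, -0.1], alpha, eta)
# With c = c*, the constant factor times the polynomial part recovers y.
recovered = math.exp(eta * math.sqrt(2 * alpha + 1) * c * const) * poly_part
```

No root search is needed once the coefficients $a_0,\ldots ,a_n$ are fixed, which is where the speed-up over the Newton-Raphson anchoring comes from.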

Remark C.2

Notice that Proposition C.1 gives a semi-closed form solution to (C.2). Then, we only need to solve

$$\begin{aligned} \inf _{(a_0,\ldots ,a_n)\in {\mathbb {R}}^{n+1}} \left\{ \frac{1}{2}||{\hat{g}}^{(n)}||^2: c=c^*\right\} \end{aligned}$$

in order to recover a solution for (C.1).

Remark C.3

Notice that $u^{-\alpha -1}\notin L^2[0,1]$; however, $u^{-\alpha -1}\mathbb {1}_{\{u>\varepsilon \}}\in L^2[0,1]$ for all $\varepsilon >0$. Moreover,

$$\begin{aligned}\int _0^s K_\alpha (u,s) u^{-\alpha -1}\mathbb {1}_{\{u>\varepsilon \}}\mathrm {d}u&=\frac{\text { }_2F_1(-\alpha ,-\alpha ,1-\alpha ,1)}{-\alpha }-\frac{\varepsilon ^{-\alpha }s^{\alpha }\text { }_2F_1(-\alpha ,-\alpha ,1-\alpha ,\frac{\varepsilon }{s})}{-\alpha }\\ {}&=\frac{\text { }_2F_1(-\alpha ,-\alpha ,1-\alpha ,1)}{-\alpha }+{\mathcal {O}}(\varepsilon ^{-\alpha }), \end{aligned}$$

hence for $\varepsilon $ sufficiently small the error is bounded as long as $\alpha \ne 0$. In our applications we find that this method behaves well for $\alpha \in (-0.5,-0.05]$. In Fig. 11 we provide precise errors, and we observe that the convergence is better for small $\alpha $ (which is rather surprising behaviour, as the converse is true of other approximation schemes when the volatility trajectories become more rough) as well as for strikes around the money. Moreover, the truncated basis approach constitutes a 30-fold speed improvement in our numerical tests. As a benchmark we consider the standard numerical algorithm introduced in (C.2), with accuracy measured by absolute error.
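The size of the truncation error in Remark C.3 can be explored numerically. The sketch below is illustrative only: it assumes the kernel is taken as $(s-u)^{\alpha }$ without its normalising constant, as in the relation above, the function names are hypothetical, and the closed form is cross-checked against the regularised incomplete beta function from SciPy.

```python
from scipy.special import betainc, gamma, hyp2f1

def truncated_kernel_integral(eps, s, alpha):
    """int_eps^s (s-u)^alpha u^(-alpha-1) du from the closed form of Remark C.3."""
    full = gamma(-alpha) * gamma(1 + alpha)   # 2F1(-a,-a;1-a;1)/(-a), requires alpha < 0
    tail = (eps ** (-alpha) * s ** alpha
            * hyp2f1(-alpha, -alpha, 1 - alpha, eps / s) / (-alpha))
    return full - tail

def truncated_norm_sq(eps, alpha):
    """Squared L^2[0,1] norm of u^(-alpha-1) restricted to u > eps."""
    return (eps ** (-(2 * alpha + 1)) - 1.0) / (2 * alpha + 1)

alpha, s = -0.3, 0.7                           # illustrative values only
full = gamma(-alpha) * gamma(1 + alpha)

# Cross-check: the substitution t = u/s turns the integral into an incomplete beta.
eps = 1e-2
via_beta = full * (1.0 - betainc(-alpha, 1 + alpha, eps / s))
closed = truncated_kernel_integral(eps, s, alpha)

# The truncation error behaves like eps^(-alpha) * s^alpha / (-alpha) for small eps.
small = 1e-8
error = full - truncated_kernel_integral(small, 1.0, alpha)
ratio = error * (-alpha) / small ** (-alpha)
```

The helper `truncated_norm_sq` makes the trade-off visible: the kernel integral error shrinks like $\varepsilon ^{-\alpha }$, while the squared norm of the truncated basis function grows like $\varepsilon ^{-(2\alpha +1)}$ as $\varepsilon \downarrow 0$.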

C.3 Multi-factor case

The correlated mixed multi-factor rough Bergomi model (2.6) requires a slightly more complex setting. By Corollary 3.15, the rate function we seek is given by the following multidimensional optimisation problem:

$$\begin{aligned} {\hat{\Lambda }}^{(v,\Sigma )}(\mathrm {y}):=\inf _{(f_1,\ldots ,f_n)\in L^2[0,1]} \left\{ \frac{1}{2}\sum _{i=1}^n||f_i||^2 : \mathrm {y}= RV\left( \sum _{i=1}^m \gamma _i \exp \left( \frac{\nu ^i}{\eta }\cdot \Sigma _i {\mathfrak {f}}^{K_{\alpha }}_{.}\right) u\right) (1) \right\} ,\qquad \end{aligned}$$

(C.4)

where ${\mathfrak {f}}^{K_{\alpha }}_{.}=\left( \int _0^{\cdot } K_{\alpha }(u,\cdot )f_1(u)\mathrm {d}u,\ldots ,\int _0^{\cdot } K_{\alpha }(u,\cdot )f_n(u)\mathrm {d}u\right) $. The approach to solve this problem is similar to that of (C.1). Nevertheless, in order to solve (C.4) we shall use a multi-dimensional polynomial basis

$$\begin{aligned} \left( {\hat{f}}^{(p)}_1(s),\ldots ,{\hat{f}}^{(p)}_n(s)\right) =\left( \sum _{i=0}^p a^1_i s^i,\ldots ,\sum _{i=0}^p a^n_i s^i\right) \end{aligned}$$

such that, for each $i\in \{1,\ldots ,n\}$, the family ${\hat{f}}^{(p)}_i(s)$ is dense in $L^2[0,1]$ as p tends to $+\infty $, by the Stone-Weierstrass theorem. Then we may equivalently solve

$$\begin{aligned} \inf _{(a_0^1,\ldots ,a_p^1,\ldots ,a_0^n,\ldots ,a_p^n)\in {\mathbb {R}}^{(p+1)n}} \left\{ \frac{1}{2}\sum _{i=1}^n||{\hat{f}}^{(p)}_i||^2 : \mathrm {y}= RV\left( \sum _{i=1}^m \gamma _i \exp \left( \frac{\nu ^i}{\eta }\cdot \Sigma _i \hat{{\mathfrak {f}}}^{(K_{\alpha },p)}_{.}\right) u\right) (1) \right\} ,\nonumber \\ \end{aligned}$$

(C.5)

where $\hat{{\mathfrak {f}}}^{(K_{\alpha },p)}_{.}=\left( \int _0^{\cdot } K_{\alpha }(u,\cdot ){\hat{f}}^{(p)}_1(u)\mathrm {d}u,\ldots ,\int _0^{\cdot } K_{\alpha }(u,\cdot ){\hat{f}}^{(p)}_n(u)\mathrm {d}u\right) $. Then as p tends to $+\infty $, (C.5) will converge to the original problem (C.4). In order to numerically accelerate the optimisation problem in (C.5), we anchor the coefficients $(a_0^1,\ldots ,a_0^n)$ to satisfy the constraint $y=RV(\cdot )(1)$, as we did in the one-dimensional case, that is

$$\begin{aligned} {\mathfrak {a}}^*:={{\,\mathrm{\arg \!\min }\,}}_{(a_0^1,\ldots ,a_0^n)\in {\mathbb {R}}^n} \left\{ \left( \mathrm {y}- RV\left( \sum _{i=1}^m \gamma _i \exp \left( \frac{\nu ^i}{\eta }\cdot \Sigma _i \hat{{\mathfrak {f}}}^{(K_{\alpha },p)}_{.}\right) u\right) (1)\right) ^2 \right\} \end{aligned}$$

where ${\mathfrak {a}}^*=({\mathfrak {a}}^{1*}_0,\ldots ,{\mathfrak {a}}^{n*}_0)$ and one may use (C.3) and Gauss–Legendre quadrature to efficiently compute $RV(\cdot )(1)$. Then, the constraint will always be satisfied by construction and instead we may solve

$$\begin{aligned} \inf _{({\mathfrak {a}}_0^{1*},a_1^1\ldots ,a_p^1,\ldots ,{\mathfrak {a}}_0^{n*},a_1^n,\ldots ,a_p^n)\in {\mathbb {R}}^{(p+1)n}} \left\{ \frac{1}{2}\sum _{i=1}^n||{\hat{f}}^{(p)}_i||^2 \right\} . \end{aligned}$$

(C.6)
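The objective in (C.6) has a simple closed form in the polynomial coefficients, since $\int _0^1 s^{j+k}\mathrm {d}s=1/(j+k+1)$. The following sketch evaluates it as a quadratic form with the Hilbert matrix; the function name and the array layout (one row of coefficients per factor) are assumptions made for illustration.

```python
import numpy as np

def multi_factor_objective(coeffs):
    """0.5 * sum_i ||f_i||^2 on L^2[0,1], with f_i(s) = sum_j coeffs[i, j] * s**j."""
    coeffs = np.atleast_2d(np.asarray(coeffs, dtype=float))
    idx = np.arange(coeffs.shape[1])
    # Gram matrix of the monomials 1, s, ..., s^p on [0, 1] (a Hilbert matrix)
    hilbert = 1.0 / (idx[:, None] + idx[None, :] + 1.0)
    return 0.5 * np.einsum("ij,jk,ik->", coeffs, hilbert, coeffs)

# Two factors of degree p = 1: f_1(s) = 1 + s and f_2(s) = 2 s.
value = multi_factor_objective([[1.0, 1.0], [0.0, 2.0]])
```

Here $||1+s||^2=7/3$ and $||2s||^2=4/3$, so the objective equals $11/6$. Note that the Hilbert matrix becomes ill-conditioned quickly as p grows, which is worth keeping in mind when choosing the degree.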

Appendix D: Exponential equivalence and contraction principle

Definition D.1

On a metric space $({\mathcal {Y}},d)$, two ${\mathcal {Y}}$-valued sequences ${(X^{\varepsilon })}_{\varepsilon > 0}$ and ${({\widetilde{X}}^{\varepsilon })}_{\varepsilon > 0}$ are called exponentially equivalent (with speed $h_\varepsilon $) if there exist probability spaces $(\Omega , {\mathcal {B}}_\varepsilon , \mathbb {P}^{\varepsilon })_{\varepsilon >0}$ such that, for any $\varepsilon >0$, $\mathbb {P}^{\varepsilon }$ is the joint law of $({\widetilde{X}}^{\varepsilon },X^{\varepsilon })$ and, for each $\delta >0$, the set $\left\{ \omega : ({\widetilde{X}}^{\varepsilon },X^{\varepsilon }) \in \Gamma _{\delta } \right\} $ is ${\mathcal {B}}_{\varepsilon }$-measurable, and

$$\begin{aligned} \limsup _{\varepsilon \downarrow 0} h_\varepsilon \log \mathbb {P}^{\varepsilon }\left( \Gamma _\delta \right) =- \infty , \end{aligned}$$

where $\Gamma _\delta := \left\{ ({\tilde{y}},y): d({\tilde{y}},y) > \delta \right\} \subset {\mathcal {Y}}\times {\mathcal {Y}}$.

Theorem D.2

Let ${\mathcal {X}}$ and ${\mathcal {Y}}$ be topological spaces and $f: {\mathcal {X}} \rightarrow {\mathcal {Y}}$ a continuous function. Consider a good rate function $I: {\mathcal {X}} \rightarrow [0,\infty ]$. For each $y\in {\mathcal {Y}}$, define $I'(y) := \inf \{I(x): x\in {\mathcal {X}}, y=f(x)\}$. Then, if I controls the LDP associated with a family of probability measures $\{\mu _\varepsilon \}$ on ${\mathcal {X}}$, then $I'$ controls the LDP associated with the family of probability measures $\{\mu _\varepsilon \circ f^{-1} \}$ on ${\mathcal {Y}}$, and $I'$ is a good rate function on ${\mathcal {Y}}$.
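As a simple illustration of Theorem D.2 (a standard textbook example, not taken from the paper), let $Z$ be a standard Gaussian random variable and let $\mu _\varepsilon $ be the law of $\sqrt{\varepsilon }Z$ on ${\mathbb {R}}$, which satisfies an LDP with speed $\varepsilon $ and good rate function $I(x)=x^2/2$. Taking the continuous map $f(x)=\mathrm {e}^x$, the family $\{\mu _\varepsilon \circ f^{-1}\}$ satisfies an LDP with rate function

$$\begin{aligned} I'(y)=\inf \left\{ \frac{x^2}{2}: \mathrm {e}^x=y\right\} ={\left\{ \begin{array}{ll} \frac{1}{2}(\log y)^2, &{} y>0,\\ +\infty , &{} y\le 0, \end{array}\right. } \end{aligned}$$

since the constraint has the unique solution $x=\log y$ for $y>0$ and no solution otherwise. Problem (C.1) has the same structure: a quadratic cost minimised over the preimage of $\mathrm {y}$ under a continuous map, except that there the preimage is not a single point and the infimum has to be computed numerically.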

Appendix E: Proof of $v_0\phi (d_1(x))=x \phi (d_2(x))$

In order to prove $v_0\phi (d_1(x))=x \phi (d_2(x))$, we will prove the following equivalent result

$$\begin{aligned} \left( d_1 (x) \right) ^2 - \left( d_2 (x) \right) ^2 = 2 \log \left( \frac{v_0}{x} \right) . \end{aligned}$$

Proof

Recall that $\phi (\cdot )$ is the standard Gaussian probability density function. Using that for $x> 0$, $d_1(x)=\frac{\log (v_0)-\log (x)}{{\hat{\sigma }}(x,T)\sqrt{T}}+\frac{1}{2}{\hat{\sigma }}(x,T)\sqrt{T}$ and $d_2(x)=d_1(x)- {\hat{\sigma }}(x,T)\sqrt{T}$, we obtain

$$\begin{aligned} \begin{aligned}\displaystyle \left( d_1 (x) \right) ^2 - \left( d_2 (x) \right) ^2&= \left( d_1 (x) \right) ^2 - \left( d_1 (x) - {\hat{\sigma }}(x,T) \sqrt{T} \right) ^2, \\&= 2 d_1 (x) {\hat{\sigma }}(x,T) \sqrt{T} - T{\hat{\sigma }}^2 (x,T), \\&= 2 \left[ \log (v_0)-\log (x)+\frac{1}{2}{\hat{\sigma }}^2(x,T) T \right] - T{\hat{\sigma }}^2 (x,T), \\&= 2 \log \left( \frac{v_0}{x} \right) . \end{aligned} \end{aligned}$$

$\square $
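The identity can also be checked numerically. The snippet below is a small sanity check with hypothetical input values; in particular the implied volatility ${\hat{\sigma }}(x,T)$ is replaced by a fixed number `sigma`, since the identity holds for any positive value.

```python
import math

def d1_d2(x, v0, sigma, T):
    """d_1 and d_2 as defined in Appendix E, for a given volatility level sigma."""
    total_vol = sigma * math.sqrt(T)
    d1 = (math.log(v0) - math.log(x)) / total_vol + 0.5 * total_vol
    return d1, d1 - total_vol

def phi(z):
    """Standard Gaussian probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

# Hypothetical inputs: forward variance v0, strike x, volatility sigma, maturity T.
v0, x, sigma, T = 0.04, 0.05, 0.8, 0.25
d1, d2 = d1_d2(x, v0, sigma, T)
lhs = v0 * phi(d1)
rhs = x * phi(d2)
gap = d1 ** 2 - d2 ** 2 - 2.0 * math.log(v0 / x)
```

Both `lhs - rhs` and `gap` should vanish up to floating-point rounding.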

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Lacombe, C., Muguruza, A. & Stone, H. Asymptotics for volatility derivatives in multi-factor rough volatility models. Math Finan Econ 15, 545–577 (2021). https://doi.org/10.1007/s11579-020-00288-5

Received: 26 September 2019
Accepted: 06 December 2020
Published: 09 January 2021
Issue Date: June 2021
DOI: https://doi.org/10.1007/s11579-020-00288-5
