
A dominance approach for comparing the performance of VaR forecasting models


Abstract

We introduce three dominance criteria to compare the performance of alternative value at risk (VaR) forecasting models. The three criteria use the information provided by a battery of VaR validation tests based on the frequency and size of exceedances, offering the possibility of efficiently summarizing a large amount of statistical information. They do not require the use of any loss function defined on the difference between VaR forecasts and observed returns, and two of the criteria are not conditioned by the choice of a particular significance level for the VaR tests. We use them to explore the potential for 1-day ahead VaR forecasting of some recently proposed asymmetric probability distributions for return innovations, as well as to compare the asymmetric power autoregressive conditional heteroskedasticity (APARCH) and the family of generalized autoregressive conditional heteroskedasticity (FGARCH) volatility specifications with more standard alternatives. Using 19 assets of different nature, the three criteria lead to similar conclusions, suggesting that the unbounded Johnson SU, the skewed Student-t and the skewed Generalized-t distributions seem to produce the best VaR forecasts. The unbounded Johnson SU distribution performs remarkably well, while symmetric distributions seem clearly inappropriate for VaR forecasting. The added flexibility of a free power parameter in the conditional volatility in the APARCH and FGARCH models leads to a better fit to return data, but it does not improve upon the VaR forecasts provided by GARCH and GJR-GARCH volatilities.


Notes

  1. For recent work on expected shortfall backtesting see Novales and Garcia-Jorcano (2019), Du and Escanciano (2016), Acerbi and Szekely (2014), Righi and Ceretta (2013), among others.

  2. Notice that this concept of dominance is different from stochastic dominance, which consists of partial orders between random variables based on shared preferences regarding sets of possible outcomes and their associated probabilities.

  3. As returns are known to have low serial correlation in levels, many papers consider AR(1) for the conditional mean of financial returns (see among others, Diamandis et al. 2011; Angelidis and Degiannakis 2008; Giot and Laurent 2003a, b). Furthermore, Novales and Garcia-Jorcano (2017) show that the important assumption for VaR performance is the probability distribution of the innovations, with the choice of volatility model playing a secondary role.

  4. Throughout the paper we think of a VaR model as a combination of a probability distribution and a volatility specification for return innovations.

  5. Even though \(I1_{12}=1\) in all those cases, the dominance relationship will be stronger for models with a higher number of rejections.

  6. The estimation results for APARCH volatility model under the different probability distributions for the stock market indices and for individual stocks in our sample are reported in Tables A1 and A2 of the Online Appendix.

  7. With the only exception of the Australian dollar.

  8. Other model specifications are dominated in just a few cases, like SKST-GARCH, JSU-GARCH, SGT-GARCH, and SGT-GJRGARCH, but they do not come up as dominant models in the first three columns.

  9. As shown in column 3 these are again generally the models that come up as dominant in pairwise comparisons according to this criterion, while rarely being dominated.

  10. This is computed as the average rank of all models incorporating a given probability distribution for standardized return innovations.

  11. Our test results could also be examined from the perspective of each asset, although the number of tests available is then considerably reduced and the ranking of models will be subject to significant sampling error.

  12. In the same way that a shorter information set reduces the power of a statistical test.

  13. In this case, we first estimated the AR(1)-GARCH conditional mean-volatility model assuming a generalized error distribution for the standardized innovations, as suggested by Bali and Theodossiou (2007). The parameters of the skewed generalized-t distribution were estimated in a second stage using the standardized returns (\(\frac{r_t-\phi _0-\phi _1 r_{t-1}}{\sigma _t}=\frac{\varepsilon _t}{\sigma _t}\)) obtained in the first step.

  14. Lambert and Laurent (2001) and Giot and Laurent (2003a) have shown that for various financial daily returns, it is realistic to assume that standardized innovations \({\hat{z}}_t\) follow a skewed Student-t distribution.

  15. The skewness parameter \(\xi >0\) is defined such that the ratio of probability masses above and below the mean is

    $$\begin{aligned} \frac{Prob(z\ge 0|\xi )}{Prob(z< 0|\xi )}=\xi ^2 \end{aligned}$$
  16. which is an extension of the generalized error distribution (GED) studied by Nelson (1991).

  17. This parameterization is used by the R package rugarch, which we use for estimating the parameters of our models.

References

  • Aas K, Haff IH (2006) The generalized hyperbolic skew Student's t-distribution. J Financ Econom 4(2):275–309

  • Abad P, Benito S, Lopez C, Sanchez-Granero MA (2016) Evaluating the performance of the skewed distributions to forecast value-at-risk in the global financial crisis. J Risk 18(5):1–28

  • Abramowitz M, Stegun IA (1972) Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, vol 9. Dover, New York

  • Acerbi C, Szekely B (2014) Backtesting expected shortfall. Publication of MSCI. https://www.msci.com/www/research-paper/research-insight-backtesting/0128184734

  • Angelidis T, Degiannakis S (2008) Volatility forecasting: intra-day versus inter-day models. J Int Financ Mark Inst Money 18(5):449–465

  • Bali TG, Theodossiou P (2007) A conditional-SGT-VaR approach with alternative GARCH models. Ann Oper Res 151:241–267

  • Bao Y, Lee T, Saltoglu B (2006) Evaluating predictive performance of value-at-risk models in emerging markets: a reality check. J Forecast 25:101–128

  • Basel Committee on Banking Supervision (2016) Standards: minimum capital requirements for market risk. Bank for International Settlements

  • Bhattacharyya M, Ritolia G (2008) Conditional VaR using EVT: towards a planned margin scheme. Int Rev Financ Anal 17(2):382–395

  • Bollerslev T (1986) Generalized autoregressive conditional heteroskedasticity. J Econom 31:307–327

  • Braione M, Scholtes NK (2016) Forecasting value-at-risk under different distributional assumptions. Econometrics 4(3)

  • Cai Y, Krishnamoorthy K (2006) Exact size and power properties of five tests for multinomial proportions. Commun Stat Simul Comput 35(1):149–160

  • Caporin M (2008) Evaluating value-at-risk measures in the presence of long memory conditional volatility. J Risk 10(3):79–110

  • Choi P, Nam K (2008) Asymmetric and leptokurtic distribution for heteroscedastic asset returns: the SU-normal distribution. J Empir Finance 15(1):41–63

  • Christoffersen P (1998) Evaluating interval forecasts. Int Econ Rev 39:841–862

  • Colletaz G, Hurlin C, Pérignon C (2013) The Risk Map: a new tool for validating risk models. J Bank Finance 37(10):3843–3854

  • Corlu CG, Meterelliyoz M, Tiniç M (2016) Empirical distributions of daily equity index returns: a comparison. Expert Syst Appl 54:170–192

  • Diamandis PF, Drakos AA, Kouretas GP, Zarangas L (2011) Value-at-risk for long and short trading positions: evidence from developed and emerging equity markets. Int Rev Financ Anal 20:165–176

  • Ding Z, Granger CWJ, Engle RF (1993) A long memory property of stock market returns and a new model. J Empir Finance 1:83–106

  • Du Z, Escanciano JC (2016) Backtesting expected shortfall: accounting for tail risk. Manag Sci 63(4):940–958

  • Engle RF, Manganelli S (2004) CAViaR: conditional autoregressive value at risk by regression quantiles. J Bus Econ Stat 22:367–381

  • Fernandez C, Steel M (1998) On Bayesian modelling of fat tails and skewness. J Am Stat Assoc 93(441):359–371

  • Gerlach R, Chen CWS, Lin EMH, Lee WCW (2011) Bayesian forecasting for financial risk management, pre and post the global financial crisis. J Forecast 31(8):661–687

  • Giacomini R, Komunjer I (2005) Evaluation and combination of conditional quantile forecasts. J Bus Econ Stat 23(4):416–431

  • Giot P, Laurent S (2003a) Value-at-risk for long and short trading positions. J Appl Econom 18:641–664

  • Giot P, Laurent S (2003b) Market risk in commodity markets: a VaR approach. Energy Econ 25(5):435–457

  • Glosten L, Jagannathan R, Runkle D (1993) On the relation between the expected value and the volatility of the nominal excess return on stocks. J Finance 48:1779–1801

  • Hentschel L (1995) All in the family: nesting symmetric and asymmetric GARCH models. J Financ Econ 39:71–104

  • Hu W (2005) Calibration of multivariate generalized hyperbolic distributions using the EM algorithm, with applications in risk management, portfolio optimization and portfolio credit risk. Dissertation, Florida State University. http://diginole.lib.fsu.edu/islandora/object/fsu:181953/datastream/PDF/view

  • Johnson NL (1949) Systems of frequency curves generated by methods of translation. Biometrika 36:149–176

  • Kratz M, Lok YH, McNeil AJ (2018) Multinomial VaR backtests: a simple implicit approach to backtesting expected shortfall. J Bank Finance 88:393–407

  • Kupiec P (1995) Techniques for verifying the accuracy of risk measurement models. J Deriv 2:174–184

  • Lambert P, Laurent S (2001) Modelling financial time series using GARCH-type models with a skewed Student distribution for the innovations. Mimeo, Université de Liège

  • Leccadito A, Boffelli S, Urga G (2014) Evaluating the accuracy of value-at-risk forecasts: new multilevel tests. Int J Forecast 30:206–216

  • Lee CF, Su JB (2015) Value-at-risk estimation via a semiparametric approach: evidence from the stock markets. In: Handbook of financial econometrics and statistics. Springer, New York

  • Lopez JA (1998) Testing your risk tests. Financial Survey (May–Jun):18–20

  • Lopez JA (1999) Methods for evaluating value-at-risk estimates. Federal Reserve Bank San Francisco Econ Rev 2:3–17

  • Louzis DP, Xanthopoulos-Sisinis S, Refenes AP (2014) Realized volatility models and alternative value-at-risk prediction strategies. Econ Model 40:101–116

  • McDonald JB, Newey WK (1988) Partially adaptive estimation of regression models via the generalized t distribution. Econom Theory 4(3):428–457

  • Mina J, Ulmer A (1999) Delta-gamma four ways. Technical report, RiskMetrics Group, pp 1–17

  • Mittnik S, Paolella M (2000) Conditional density and value-at-risk prediction of Asian currency exchange rates. J Forecast 19:313–333

  • Nakajima J, Omori Y (2012) Stochastic volatility model with leverage and asymmetrically heavy-tailed error using GH skew Student's t-distribution. Comput Stat Data Anal 56(11):3690–3704

  • Nelson DB (1991) Conditional heteroskedasticity in asset returns: a new approach. Econometrica 59(2):347–370

  • Novales A, Garcia-Jorcano L (2019) Backtesting extreme value theory models of expected shortfall. Quant Finance 19(5):799–825. https://doi.org/10.1080/14697688.2018.1535182

  • Novales A, Garcia-Jorcano L (2017) Volatility specifications versus probability distributions in VaR forecasting. Available at SSRN https://ssrn.com/abstract=3023885

  • Nozari M, Raei S, Jahanguin P, Bahramgiri M (2010) A comparison of heavy-tailed estimates and filtered historical simulation: evidence from emerging markets. Int Rev Bus Pap 6(4):347–359

  • Ozun A, Cifter A, Yilmazer S (2010) Filtered extreme-value theory for value-at-risk estimation: evidence from Turkey. J Risk Finance 11(2):164–179

  • Paolella MS, Polak P (2015) COMFORT: a common market factor non-Gaussian returns model. J Econom 187(2):593–605

  • Pérignon C, Smith DR (2008) A new approach to comparing VaR estimation methods. J Deriv 16(2):54–66

  • Righi MB, Ceretta PS (2013) Individual and flexible expected shortfall backtesting. J Risk Model Valid 7(3):3–20

  • RiskMetrics (1996) RiskMetrics technical document. JP Morgan

  • Sarma M, Thomas S, Shah A (2003) Selection of value at risk models. J Forecast 22:337–358

  • Schwert W (1990) Stock volatility and the crash of '87. Rev Financ Stud 3:77–102

  • Simonato JG (2011) The performance of Johnson distributions for computing value at risk and expected shortfall. J Deriv 19(1):7–24

  • Taylor SJ (1986) Modelling financial time series. Wiley, Hoboken

  • Theodossiou P (1998) Financial data and the skewed generalized t distribution. Manag Sci 44:1650–1661

  • Yu PLH, Li WK, Jin S (2010) On some models for value-at-risk. Econom Rev 29(5–6):622–641

  • Zangari P (1996) An improved methodology for measuring VaR. RiskMetrics Monitor, 2nd quarter, pp 7–25


Author information

Corresponding author

Correspondence to Alfonso Novales.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors gratefully acknowledge financial support from grants ECO2015-67305-P (Ministerio de Economía y Competitividad), PrometeoII/2013/015 (Generalitat Valenciana), and the Programa de Ayudas a la Investigación en Macroeconomía, Economía Monetaria, Financiera y Bancaria e Historia Económica 2015-2016 (Banco de España).

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 129 KB)

Appendix

1.1 Volatility models and probability distributions

Let \(r_t\), for \(t=1,...,T\), be a time series of asset returns. It is convenient to break down the complete characterization of \(r_t\) into three components: (i) the conditional mean, \(\mu _t\); (ii) the conditional variance, \(\sigma _{t}^2\); and (iii) the shape parameters, which determine the form of a conditional distribution (e.g. skewness or kurtosis) within a general family of distributions. Thus, we may write

$$\begin{aligned} r_t=\mu _t(\theta )+\varepsilon _t, \qquad \mu _t(\theta )={\mathbb {E}}[r_t|{\mathscr {F}}_{t-1}]=\mu (\theta ,{\mathscr {F}}_{t-1}), \qquad \varepsilon _t=\sigma _t(\theta )z_t,\\ \sigma _{t}^2(\theta )={\mathbb {E}}[(r_t-\mu _t)^{2}|{\mathscr {F}}_{t-1}]=\sigma ^2(\theta ,{\mathscr {F}}_{t-1}), \qquad z_t\sim f(z_t|\theta ).\\ \end{aligned}$$

The standardized innovation, \(z_t=(r_t-\mu _t(\theta ))/\sigma _t(\theta )\) has zero mean and unit variance. It follows a conditional distribution f with shape parameters that capture the possible asymmetry and fat-tailedness of returns, except in the case of the normal distribution. The vector \(\theta \) contains all the parameters associated with the conditional mean and variance and the conditional distribution.

An AR(1) model for the conditional mean return is sufficient to produce serially uncorrelated innovations for all assets. For all the models we jointly obtain maximum likelihood estimates of the parameters in the mean return equation, the conditional variance, and the probability distribution for standardized return innovations. The exception is the skewed generalized-t distribution, for which we use a two-step estimation method because of the numerical difficulty of estimating all its parameters jointly.Footnote 13
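
To make the decomposition concrete, the following minimal Python sketch (the function and variable names are illustrative and not part of the paper; the AR(1) parameters and the conditional volatilities are assumed to have been estimated already) recovers the standardized innovations \(z_t\) from an AR(1) conditional mean and a given conditional volatility series:

```python
import numpy as np

def standardized_innovations(r, phi0, phi1, sigma):
    """Compute z_t = (r_t - phi0 - phi1*r_{t-1}) / sigma_t for t = 2, ..., T.

    r          : array of returns of length T
    phi0, phi1 : AR(1) conditional-mean parameters (assumed already estimated)
    sigma      : array of conditional standard deviations aligned with r
    """
    eps = r[1:] - phi0 - phi1 * r[:-1]   # AR(1) innovations epsilon_t
    return eps / sigma[1:]               # standardized innovations z_t
```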

1.1.1 Volatility models

The conditional variance of the GARCH(1,1) model (Bollerslev 1986) is used as a benchmark, i.e.

$$\begin{aligned} \sigma _{t}^{2}=\omega +\alpha _{1}\varepsilon _{t-1}^{2}+\beta _{1}\sigma _{t-1}^{2}, \end{aligned}$$

where \(\omega >0\), \(\alpha _{1},\beta _{1}\ge 0\), \(\alpha _{1}+\beta _{1}<1\).
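
As an illustration, here is a minimal sketch of the GARCH(1,1) variance recursion (illustrative code, not the authors' implementation; initializing at the unconditional variance assumes \(\alpha_1+\beta_1<1\)):

```python
import numpy as np

def garch11_variance(eps, omega, alpha1, beta1):
    """GARCH(1,1) recursion: sigma_t^2 = omega + alpha1*eps_{t-1}^2 + beta1*sigma_{t-1}^2."""
    sigma2 = np.empty(len(eps))
    sigma2[0] = omega / (1.0 - alpha1 - beta1)   # unconditional variance as starting value
    for t in range(1, len(eps)):
        sigma2[t] = omega + alpha1 * eps[t - 1] ** 2 + beta1 * sigma2[t - 1]
    return sigma2
```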

The standard GARCH model captures volatility clustering, but it assumes that positive and negative error terms have the same effect on volatility. To allow positive and negative surprises to affect volatility asymmetrically, Glosten et al. (1993) proposed the GJR-GARCH(1,1) model, which introduces the leverage effect in the conditional variance equation through the indicator function \(I(\varepsilon _{t-1}\le 0)\), so that the variance equation becomes

$$\begin{aligned} \sigma _{t}^{2}=\omega +\left[ \alpha _{1}\varepsilon _{t-1}^{2}+\gamma _{1}I(\varepsilon _{t-1}\le 0)\varepsilon ^{2}_{t-1}\right] +\beta _{1}\sigma _{t-1}^{2}. \end{aligned}$$

The volatility effect of a unit negative shock is \(\alpha _{1}+\gamma _{1}\), while the effect of a unit positive shock is \(\alpha _{1}\). A positive value of \(\gamma _{1}\) indicates that a negative innovation generates greater volatility than a positive innovation of equal size, and the opposite holds for a negative value of \(\gamma _{1}\).
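
A corresponding sketch of the GJR-GARCH(1,1) filter (again illustrative code, not the authors' implementation; the starting value is an assumption):

```python
import numpy as np

def gjr_garch11_variance(eps, omega, alpha1, gamma1, beta1):
    """GJR-GARCH(1,1): the impact of eps_{t-1}^2 is alpha1 + gamma1 when eps_{t-1} <= 0."""
    sigma2 = np.empty(len(eps))
    sigma2[0] = np.var(eps)                      # simple starting value (assumption)
    for t in range(1, len(eps)):
        leverage = gamma1 if eps[t - 1] <= 0 else 0.0
        sigma2[t] = omega + (alpha1 + leverage) * eps[t - 1] ** 2 + beta1 * sigma2[t - 1]
    return sigma2
```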

The APARCH model (Asymmetric Power ARCH model) was proposed by Ding et al. (1993). This model incorporates volatility clustering, fat tails, excess kurtosis, the leverage effect, and the Taylor (1986) effect that the sample autocorrelation of absolute returns is usually larger than that of squared returns. The APARCH(1,1) variance equation is

$$\begin{aligned} \sigma _{t}^{\delta }=\omega + \alpha _{1}(|\varepsilon _{t-1}|-\gamma _{1}\varepsilon _{t-1})^{\delta }+\beta _{1}(\sigma _{t-1})^{\delta }, \end{aligned}$$

where \(\omega \), \(\alpha _{1}\), \(\gamma _{1}\), \(\beta _{1}\), and \(\delta \) are parameters to be estimated. The parameter \(\gamma _{1}\) reflects the leverage effect \((-1<\gamma _{1}<1)\). A positive (resp. negative) value of \(\gamma _{1}\) means that past negative (resp. positive) shocks have a deeper impact on current conditional volatility than past positive (resp. negative) shocks. The parameter \(\delta \) \((\delta >0)\) plays the role of a Box-Cox transformation of \(\sigma _{t}\). The APARCH equation must satisfy the conditions: i) \(\omega >0\) (since the variance is positive), \(\alpha _{1}\ge 0\) and \(\beta _{1}\ge 0\), so that \(\sigma _{t}^{\delta }=\omega \) when \(\alpha _{1}=0\) and \(\beta _1=0\); ii) \(0\le \alpha _{1}+\beta _{1}\le 1\). The APARCH model is very flexible, having as special cases the GARCH and GJR-GARCH models, among others.
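
A sketch of the APARCH(1,1) recursion follows (illustrative code with our own crude initialization; setting \(\delta=2\) and \(\gamma_1=0\) recovers the GARCH(1,1) recursion, while \(\delta=2\) with \(\gamma_1\) free gives a GJR-type asymmetric response):

```python
import numpy as np

def aparch11_volatility(eps, omega, alpha1, gamma1, beta1, delta):
    """APARCH(1,1): sigma_t^delta = omega + alpha1*(|eps_{t-1}| - gamma1*eps_{t-1})^delta + beta1*sigma_{t-1}^delta."""
    sigma_d = np.empty(len(eps))                     # stores sigma_t^delta
    sigma_d[0] = np.mean(np.abs(eps)) ** delta       # crude starting value (assumption)
    for t in range(1, len(eps)):
        shock = np.abs(eps[t - 1]) - gamma1 * eps[t - 1]
        sigma_d[t] = omega + alpha1 * shock ** delta + beta1 * sigma_d[t - 1]
    return sigma_d ** (1.0 / delta)                  # conditional standard deviation
```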

The FGARCH model of Hentschel (1995) is more general than the APARCH model, since it allows the transformed shock and the conditional standard deviation in the variance equation to be raised to different powers. It also allows for both shifts and rotations in the news impact curve, where the shift is the main source of asymmetry for small shocks while rotation drives the asymmetry for large shocks. The FGARCH(1,1) is defined as

$$\begin{aligned} \sigma _t^{\lambda }=\omega +\alpha _{1}\sigma _{t-1}^{\lambda }f^{\delta }(z_{t-1})+\beta _{1}(\sigma _{t-1})^{\lambda }, \end{aligned}$$

where \(f^{\delta }(z_{t-1})=(|z_{t-1}-\eta _{21}|-\eta _{11}(z_{t-1}-\eta _{21}))^{\delta }\).

Positivity of \(f^{\delta }(z_{t-1})\) is guaranteed when \(|\eta _{11}|\le 1\), which ensures that neither arm of the rotated absolute value function crosses the abscissa. However, the parameter \(\eta _{21}\) is unrestricted in size and sign. The magnitude and direction of a shift in the news impact curve are controlled by the parameter \(\eta _{21}\), while the magnitude and direction of a rotation in the news impact curve are controlled by the parameter \(\eta _{11}\). Other GARCH models permit only a shift or a rotation, but not both. By allowing for shifts in the news impact curve, the FGARCH model is more flexible than previous models, being able to capture asymmetries in volatility even in the presence of small shocks.
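
The shift-and-rotation mechanism can be sketched as follows (illustrative code, not the authors' implementation; the starting volatility sigma0 is a user-supplied assumption):

```python
import numpy as np

def fgarch_news_impact(z, eta11, eta21, delta):
    """Transformed shock f^delta(z) = (|z - eta21| - eta11*(z - eta21))^delta.

    eta21 shifts the news impact curve; eta11 (with |eta11| <= 1) rotates it.
    """
    u = z - eta21
    return (np.abs(u) - eta11 * u) ** delta

def fgarch11_volatility(z, omega, alpha1, beta1, eta11, eta21, delta, lam, sigma0):
    """FGARCH(1,1): sigma_t^lam = omega + sigma_{t-1}^lam * (alpha1*f^delta(z_{t-1}) + beta1)."""
    sigma_l = np.empty(len(z))
    sigma_l[0] = sigma0 ** lam                       # user-supplied starting volatility (assumption)
    for t in range(1, len(z)):
        sigma_l[t] = omega + sigma_l[t - 1] * (alpha1 * fgarch_news_impact(z[t - 1], eta11, eta21, delta) + beta1)
    return sigma_l ** (1.0 / lam)
```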

1.1.2 Probability distributions

As probability distributions for the standardized innovations we compare the performance of the skewed Student-t, skewed generalized error (GED), unbounded Johnson \(S_U\), skewed generalized-t (SGT), and generalized hyperbolic skew Student-t distributions, with the normal and symmetric Student-t distributions as benchmarks.

To account for the excess skewness and kurtosis typical of financial data, the parametric volatility models presented in the previous section can be combined with skewed and leptokurtic distributions for return standardized innovations. The skewed Student-t distribution of Fernandez and Steel (1998) and Lambert and Laurent (2001)Footnote 14 is

$$\begin{aligned} f(z|\xi ,\nu )=\frac{2}{\xi +\frac{1}{\xi }}s\{g[\xi (sz+m)|\nu ]I_{(-\infty ,0)}(z+m/s)+ g[(sz+m)/\xi |\nu ]I_{[0,\infty )}(z+m/s)\}, \end{aligned}$$
(1)

where \(g(\cdot |\nu )\) is the symmetric (unit variance) Student-t density and \(\xi \) is the skewness parameter;Footnote 15 \(m\) and \(s^2\) are, respectively, the mean and the variance of the non-standardized skewed Student-t, defined by

$$\begin{aligned} E(\varepsilon |\xi )= & {} M_1(\xi -\xi ^{-1})\equiv m, \\ V(\varepsilon |\xi )= & {} (M_2-M_1^2)(\xi ^2+\xi ^{-2})+2M_1^2-M_2 \equiv s^2, \end{aligned}$$

where \(M_r=2\int _0^{\infty }u^rg(u)du\) is the rth order absolute moment of g. When \(\xi =1\) and \(\nu >2\), we recover the skewness and kurtosis of the (standardized) Student-t distribution, and when \(\xi =1\) and \(\nu =+\infty \), those of the Gaussian density.
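
For reference, a minimal numerical sketch of this density (illustrative code, not the authors' implementation; \(M_1\) is obtained by numerical integration rather than in closed form, and \(M_2=1\) because g has unit variance):

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

def skewed_t_pdf(z, xi, nu):
    """Fernandez-Steel skewed Student-t density for standardized innovations (Eq. 1)."""
    c = np.sqrt(nu / (nu - 2.0))                     # rescales t_nu to unit variance

    def g(x):                                        # unit-variance Student-t density
        return c * stats.t.pdf(c * np.asarray(x, dtype=float), df=nu)

    M1 = 2.0 * quad(lambda u: u * g(u), 0.0, np.inf)[0]
    M2 = 1.0                                         # second absolute moment of a unit-variance density
    m = M1 * (xi - 1.0 / xi)
    s = np.sqrt((M2 - M1 ** 2) * (xi ** 2 + xi ** -2) + 2.0 * M1 ** 2 - M2)
    arg = s * np.asarray(z, dtype=float) + m
    return 2.0 / (xi + 1.0 / xi) * s * np.where(arg < 0.0, g(xi * arg), g(arg / xi))
```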

An alternative distribution for standardized return innovations \(z_t\) that can capture skewness and kurtosis is the standardized skewed generalized error distribution, \(SGED(0,1,\xi ,\kappa )\), of Lambert and Laurent, with densityFootnote 16

$$\begin{aligned} f(z|\xi ,\kappa )=\frac{2}{\xi +\frac{1}{\xi }}s\{g[\xi (sz+m)|\kappa ]I_{(-\infty ,0)}(z+m/s)+ g[(sz+m)/\xi |\kappa ]I_{[0,\infty )}(z+m/s)\}, \end{aligned}$$

where \(g(\cdot |\kappa )\) is the symmetric (unit variance) generalized error density, \(\xi \) is the skewness parameter, \(\kappa \) is the shape parameter, and \(\varGamma (\cdot )\) denotes the gamma function. The mean (m) and standard deviation (s) are calculated as in the skewed Student-t case. As \(\kappa \) increases the density becomes flatter and, in the limit \(\kappa \rightarrow \infty \), it tends toward the uniform distribution. Special cases are the normal distribution, when \(\kappa =2\), and the Laplace distribution, when \(\kappa =1\). For \(\kappa >2\) the distribution is platykurtic and for \(\kappa <2\) it is leptokurtic.
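
The same skewing mechanism is applied to a GED kernel; a sketch of the unit-variance GED density used as \(g(\cdot|\kappa)\) (illustrative code, with the scale chosen so that the variance equals one):

```python
import numpy as np
from scipy.special import gamma as G

def ged_unit_variance_pdf(x, kappa):
    """Unit-variance generalized error density, the symmetric kernel g(.|kappa) of the SGED."""
    lam = np.sqrt(G(1.0 / kappa) / G(3.0 / kappa))   # scale giving unit variance
    x = np.asarray(x, dtype=float)
    return kappa / (2.0 * lam * G(1.0 / kappa)) * np.exp(-(np.abs(x) / lam) ** kappa)
```

Plugging this kernel into the skewing construction of the previous sketch yields the SGED density.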

Another alternative is the Johnson \(S_U\) distribution, one of the distributions derived by Johnson (1949) by applying translations to the normal distribution. Letting \(Y\sim N(0,1)\) be a standard normal random variable, the random variable Z belongs to the Johnson system of frequency curves if it is related to Y through a transformation of the form \(Y=\gamma +\delta g((Z-\xi )/\lambda )\). The form of the resulting distribution depends on the choice of the function g. When \(g(u)=sinh^{-1}(u)\), the resulting unbounded distribution is called the Johnson \(S_U\) distribution. The parameters of the distribution are \(\xi \), \(\lambda >0\), \(\gamma \), and \(\delta >0\).

We use a parameterizationFootnote 17 of the original Johnson \(S_U\) distribution such that the parameters \(\xi \) and \(\lambda \) are the mean and standard deviation of the distribution. The parameter \(\gamma \) determines the skewness of the distribution with \(\gamma >0\) indicating positive skewness and \(\gamma <0\) negative skewness. The parameter \(\delta \) determines the kurtosis of the distribution, and \(\delta \) should be positive and most likely above 1.

The pdf of the Johnson’s \(S_U\), denoted here by \(JSU(\xi ,\lambda ,\gamma ,\delta )\), is defined by

$$\begin{aligned} f_Z(z)=\frac{\delta }{c\lambda }\frac{1}{\sqrt{(r^2+1)}}\frac{1}{\sqrt{2\pi }}exp\left[ -\frac{1}{2}y^2\right] , \end{aligned}$$

with

$$\begin{aligned} y= & {} -\gamma +\delta sinh^{-1}(r)=-\gamma +\delta \log \left[ r+(r^2+1)^{1/2}\right] , \\ r= & {} \frac{z-(\xi +c\lambda \omega ^{1/2}sinh\varOmega )}{c\lambda }, \\ c= & {} \left\{ \frac{1}{2}(\omega -1)[\omega cosh2\varOmega +1]\right\} ^{-1/2}, \end{aligned}$$

where \(\omega =exp(\delta ^{-2})\) and \(\varOmega =-\gamma /\delta \). Note that \(Y\sim N(0,1)\). Here \(E(Z)=\xi \) and \(Var(Z)=\lambda ^2\).
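
A direct transcription of these formulas into code (an illustrative sketch, not the authors' implementation; parameter names follow the text):

```python
import numpy as np

def jsu_pdf(z, xi, lam, gamma, delta):
    """Johnson S_U density in the mean/standard-deviation parameterization of the text."""
    w = np.exp(delta ** -2.0)
    Omega = -gamma / delta
    c = (0.5 * (w - 1.0) * (w * np.cosh(2.0 * Omega) + 1.0)) ** -0.5
    z = np.asarray(z, dtype=float)
    r = (z - (xi + c * lam * np.sqrt(w) * np.sinh(Omega))) / (c * lam)
    y = -gamma + delta * np.arcsinh(r)               # arcsinh(r) = log(r + sqrt(r^2 + 1))
    return delta / (c * lam) / np.sqrt(r ** 2 + 1.0) / np.sqrt(2.0 * np.pi) * np.exp(-0.5 * y ** 2)
```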

The skewed generalized-t distribution proposed by Theodossiou (1998) is a quite flexible distribution extending the generalized-t distribution of McDonald and Newey (1988). It has probability density function

$$\begin{aligned} f(x|\mu ,\sigma ,\lambda ,p,q)=\frac{p}{2\nu \sigma q^{1/p}B(\frac{1}{p},q)\left( \frac{|x-\mu + m|^p}{q(\nu \sigma )^p(\lambda sign(x-\mu + m)+1)^p}+1\right) ^{\frac{1}{p}+q}}, \end{aligned}$$

with

$$\begin{aligned}&\displaystyle m=\frac{2\nu \sigma \lambda q^{\frac{1}{p}}B\left( \frac{2}{p},q-\frac{1}{p}\right) }{B\left( \frac{1}{p},q\right) }, \\&\displaystyle \nu =q^{-\frac{1}{p}}\left[ (3\lambda ^2+1)\left( \frac{B\left( \frac{3}{p},q-\frac{2}{p}\right) }{B\left( \frac{1}{p},q\right) }\right) -4\lambda ^2\left( \frac{B\left( \frac{2}{p},q-\frac{1}{p}\right) }{B\left( \frac{1}{p},q\right) }\right) ^2\right] ^{-\frac{1}{2}}, \end{aligned}$$

where \(B(\cdot )\) is the beta function, and \(\mu \), \(\sigma \), \(\lambda \), p, and q are the location, scale, skewness, peakedness, and tail-thickness parameters, respectively, with \(\sigma >0\), \(-1<\lambda <1\), \(p>0\), and \(q>0\). The skewness parameter \(\lambda \) controls the rate of descent of the density around \(x=0\). The parameters p and q control the height and tails of the density, respectively. The parameter q has the degrees of freedom interpretation in the case \(\lambda =0\) and \(p=2\).
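
A sketch of this density, transcribed from the formulas above (illustrative code, not the authors' implementation; it assumes \(q>2/p\) so that the beta functions are defined and the variance exists):

```python
import numpy as np
from scipy.special import beta as B

def sgt_pdf(x, mu, sigma, lam, p, q):
    """Skewed generalized-t density (Theodossiou 1998) in the parameterization of the text."""
    nu = q ** (-1.0 / p) * ((3.0 * lam ** 2 + 1.0) * B(3.0 / p, q - 2.0 / p) / B(1.0 / p, q)
                            - 4.0 * lam ** 2 * (B(2.0 / p, q - 1.0 / p) / B(1.0 / p, q)) ** 2) ** -0.5
    m = 2.0 * nu * sigma * lam * q ** (1.0 / p) * B(2.0 / p, q - 1.0 / p) / B(1.0 / p, q)
    d = np.asarray(x, dtype=float) - mu + m
    core = np.abs(d) ** p / (q * (nu * sigma) ** p * (lam * np.sign(d) + 1.0) ** p) + 1.0
    return p / (2.0 * nu * sigma * q ** (1.0 / p) * B(1.0 / p, q) * core ** (1.0 / p + q))
```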

The distributions belonging to the generalized hyperbolic family are more complex and more recent. A special case of this family is the generalized hyperbolic skew Student-t distribution proposed by Aas and Haff (2006). This distribution has the important property that one tail behaves polynomially while the other behaves exponentially; indeed, it is the only subclass of the generalized hyperbolic family with this property. This makes it an attractive alternative for modeling the empirical distribution of financial returns, which is often skewed, with one heavy and one semi-heavy or Gaussian-like tail. Skew extensions of the Student-t distribution, like that of Fernandez and Steel, have two polynomially behaving tails. This means that they fit heavy-tailed data well, but they are not well suited to modeling substantial skewness, since that requires one heavy tail and one non-heavy tail.

The probability density function of the generalized hyperbolic skew Student-t is given by

$$\begin{aligned} f_X(x)=\frac{2^{\frac{1-\nu }{2}}\delta ^{\nu }|\beta |^{\frac{\nu +1}{2}}K_{\frac{\nu +1}{2}}\left( \sqrt{\beta ^2(\delta ^2+(x-\mu )^2)}\right) exp(\beta (x-\mu ))}{\varGamma (\frac{\nu }{2})\sqrt{\pi }\left( \sqrt{\delta ^2+(x-\mu )^2}\right) ^{\frac{\nu +1}{2}}}, \qquad \beta \ne 0, \end{aligned}$$

and

$$\begin{aligned} f_X(x)=\frac{\varGamma (\frac{\nu +1}{2})}{\sqrt{\pi }\delta \varGamma (\frac{\nu }{2})}\left[ 1+\frac{(x-\mu )^2}{\delta ^2}\right] ^{-(\nu +1)/2}, \qquad \beta = 0, \end{aligned}$$

where \(K_{\nu }(x) \sim \sqrt{\frac{\pi }{2x}}exp(-x)\) as \(x \rightarrow \infty \) is the modified Bessel function (Abramowitz and Stegun 1972), and \(\mu \), \(\delta \), \(\beta \), and \(\nu \) are the location, scale, skewness, and shape parameters, respectively.

When \(\beta =0\) the density \(f_X(x)\) is that of a noncentral Student-t distribution with \(\nu \) degrees of freedom, expectation \(\mu \), and variance \(\delta ^2/(\nu -2)\).
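
A numerical sketch of this density (illustrative code, not the authors' implementation; the modified Bessel function is evaluated with scipy.special.kv):

```python
import numpy as np
from scipy.special import kv, gamma as G

def gh_skew_t_pdf(x, mu, delta, beta, nu):
    """Generalized hyperbolic skew Student-t density of Aas and Haff (2006)."""
    x = np.asarray(x, dtype=float)
    q = np.sqrt(delta ** 2 + (x - mu) ** 2)
    if beta == 0.0:                                   # scaled Student-t limit
        return (G((nu + 1.0) / 2.0) / (np.sqrt(np.pi) * delta * G(nu / 2.0))
                * (1.0 + (x - mu) ** 2 / delta ** 2) ** (-(nu + 1.0) / 2.0))
    num = (2.0 ** ((1.0 - nu) / 2.0) * delta ** nu * np.abs(beta) ** ((nu + 1.0) / 2.0)
           * kv((nu + 1.0) / 2.0, np.abs(beta) * q) * np.exp(beta * (x - mu)))
    den = G(nu / 2.0) * np.sqrt(np.pi) * q ** ((nu + 1.0) / 2.0)
    return num / den
```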

1.2 VaR backtesting

The unconditional coverage test introduced by Kupiec (1995) is based on the number of violations, i.e. the number of times (\(T_1\)) that returns exceed the predicted VaR over a period of time T for a given significance level. If the VaR model is correctly specified, the failure rate (\({\hat{\pi }}=\frac{T_1}{T}\)) should be equal to the prespecified VaR level (\(\alpha \)). The null hypothesis \(H_0:\pi =\alpha \) is evaluated through the likelihood ratio test

$$\begin{aligned} LR_{uc}=-2\ln \left( \frac{L(\varPi _{\alpha })}{L({\widehat{\varPi }})}\right) =-2\ln \left( \frac{(1-\alpha )^{T_0}\alpha ^{T_1}}{(1-{\hat{\pi }})^{T_0}{\hat{\pi }}^{T_1}}\right) \quad {\mathop {\longrightarrow }\limits ^{T \rightarrow \infty }}\chi _1^2, \end{aligned}$$

where \(T_0=T-T_1\).
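
A minimal implementation sketch of this test (illustrative code; `exceed` is assumed to be a boolean series of VaR exceedances):

```python
import numpy as np
from scipy.stats import chi2

def kupiec_uc_test(exceed, alpha):
    """Kupiec (1995) unconditional coverage LR test; returns the statistic and its chi2(1) p-value."""
    T = len(exceed)
    T1 = int(np.sum(exceed))
    T0 = T - T1
    pi_hat = T1 / T
    ll0 = T0 * np.log(1.0 - alpha) + T1 * np.log(alpha)      # log-likelihood under H0: pi = alpha
    ll1 = T0 * np.log(1.0 - pi_hat) + T1 * np.log(pi_hat) if 0 < T1 < T else 0.0
    lr = -2.0 * (ll0 - ll1)
    return lr, chi2.sf(lr, df=1)
```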

The dynamic quantile test proposed by Engle and Manganelli (2004) overcomes some drawbacks of the conditional coverage test of Christoffersen (1998) by using a linear regression model that links current violations to past violations. We define the auxiliary variable \(Hit_t(\alpha )=I_t(\alpha )-\alpha \), so that \(Hit_t(\alpha )=1-\alpha \) if \(r_t<VaR_{t|t-1}(\alpha )\) and \(Hit_t(\alpha )=-\alpha \) otherwise, where \(I_t(\alpha )\) is equal to 1 if \(r_t<VaR_{t|t-1}(\alpha )\) and equal to 0 otherwise. The null hypothesis of this test is that the sequence of hits (\(Hit_t\)) has mean zero and is uncorrelated with any variable belonging to the information set \(\varOmega _{t-1}\) available when the VaR was calculated, which implies, in particular, that the hits are not autocorrelated. The dynamic quantile test is a Wald test of the null hypothesis that all coefficients in the regression model

$$\begin{aligned} Hit_t(\alpha )=\delta _0+\sum _{i=1}^{p}\delta _{i}Hit_{t-i}+\sum _{j=1}^{q}\delta _{p+j}X_j+\epsilon _t, \end{aligned}$$

are zero, where \(X_j\) are explanatory variables contained in \(\varOmega _{t-1}\). The test statistic has an asymptotic \(\chi _{p+q+1}^2\) distribution. In our implementation of the test, we use \(p=5\) and \(q=1\) (where \(X_1=VaR(\alpha )\)) as proposed by Engle and Manganelli (2004). By doing so, we are testing whether the probability of an exception depends on the level of the VaR.
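
An illustrative implementation with p=5 lagged hits and the contemporaneous VaR forecast as regressor (our own sketch, not the authors' code; it uses the common Wald form \(DQ=\hat{\delta}'X'X\hat{\delta}/(\alpha(1-\alpha))\) of the out-of-sample statistic):

```python
import numpy as np
from scipy.stats import chi2

def dq_test(returns, var_forecast, alpha, p=5):
    """Dynamic quantile test: regress Hit_t on a constant, p lagged hits and VaR_t, then Wald-test the coefficients."""
    r = np.asarray(returns, dtype=float)
    v = np.asarray(var_forecast, dtype=float)
    hit = (r < v).astype(float) - alpha
    T = len(hit)
    rows = [np.concatenate(([1.0], hit[t - p:t][::-1], [v[t]])) for t in range(p, T)]
    X = np.array(rows)
    y = hit[p:]
    delta, *_ = np.linalg.lstsq(X, y, rcond=None)     # OLS estimate of the regression coefficients
    dq = delta @ X.T @ X @ delta / (alpha * (1.0 - alpha))
    return dq, chi2.sf(dq, df=X.shape[1])             # df = p + q + 1 = 7 for p = 5, q = 1
```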

To account for both the number and magnitude of extreme losses, we also evaluate the performance of VaR models using the Multinomial VaR backtests (Kratz et al. 2018) and the Risk Map test (Colletaz et al. 2013).

The multinomial VaR backtests introduced by Kratz et al. (2018) are based on the observation that testing VaR estimates simultaneously at N levels leads to a multinomial distribution. Let \(X_t=\sum _{j=1}^{N}I_{t,j}\), where \(I_{t,j}=I_{r_t<VaR_{t|t-1}(\alpha _j)}\) is the exception indicator for level \(\alpha _{j}\) at time t. The sequence \((X_t)_{t=1,...,n}\) counting the number of VaR levels that are breached should satisfy two conditions: i) the unconditional coverage hypothesis, \(P(X_t\le j)=\alpha _{j+1},\quad j=0, ..., N\quad \forall t\), and ii) the independence hypothesis, \(X_t\) independent of \(X_s\) for \(s\ne t\). Let \(MN(n, (p_0, ..., p_N))\) denote the multinomial distribution with n trials, each of which may result in one of \(N+1\) outcomes 0, 1, ..., N according to probabilities \(p_0,...,p_N\) that sum to one. If we define the observed cell counts by \(O_j=\sum _{t=1}^{n}I_{\{X_t=j\}}, j=0,1,...,N,\) then, under the unconditional coverage and independence assumptions, the random vector \((O_0, ..., O_N)\) should follow the multinomial distribution \((O_0,...,O_N)\sim MN(n, \alpha _1-\alpha _0, ..., \alpha _{N+1}-\alpha _{N})\), where \(\alpha _0=0\) and \(\alpha _{N+1}=1\). More formally, let \(0=\theta _0<\theta _1< ...<\theta _N<\theta _{N+1}=1\) be an arbitrary sequence of parameters and consider the model \((O_0,...,O_N)\sim MN(n, \theta _1-\theta _0, ..., \theta _{N+1}-\theta _{N})\). We test the null and alternative hypotheses given by

$$\begin{aligned} H_0:&\ \theta _j=\alpha _j \ \text {for}\quad j=1, ..., N\\ H_1:&\ \theta _j\ne \alpha _j\ \text {for at least one}\ j\in \{1, ..., N\} \end{aligned}$$

Various test statistics can be used to evaluate these hypotheses. Kratz et al. (2018) propose three of the five tests of multinomial proportions studied by Cai and Krishnamoorthy (2006): the standard Pearson chi-square test, the Nass test, and the likelihood ratio test. We use the Nass test, which performs better with small cell counts, with test statistic

$$\begin{aligned} c\cdot S_N {\mathop {\sim }\limits ^{d}} \chi _{\nu }^2 \end{aligned}$$

with

$$\begin{aligned} S_N=\sum _{j=0}^{N}\frac{(O_{j}-n(\alpha _{j+1}-\alpha _j))^2}{n(\alpha _{j+1}-\alpha _j)}, \quad c=\frac{2{\mathbb {E}}(S_N)}{{\mathbb {V}}ar(S_N)}\quad \text {and} \quad \nu =c{\mathbb {E}}(S_N), \end{aligned}$$

where \({\mathbb {E}}(S_N)=N\) and \({\mathbb {V}}ar(S_N)=2N-\frac{N^2+4N+1}{n}+\frac{1}{n}\sum _{j=0}^{N}\frac{1}{\alpha _{j+1}-\alpha _j}\).
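
A sketch of the Nass test computed from the observed cell counts (illustrative code, not the authors' implementation; `O` collects the counts \(O_0,\dots,O_N\) and `alphas` the increasing levels \(\alpha_1<\dots<\alpha_N\) as defined in the text):

```python
import numpy as np
from scipy.stats import chi2

def nass_test(O, alphas):
    """Nass multinomial backtest (Kratz et al. 2018) from observed cell counts O_0, ..., O_N."""
    O = np.asarray(O, dtype=float)
    N = len(alphas)
    n = O.sum()
    p = np.diff(np.concatenate(([0.0], np.asarray(alphas, dtype=float), [1.0])))   # expected cell probabilities
    S = np.sum((O - n * p) ** 2 / (n * p))             # Pearson-type statistic S_N
    ES = float(N)
    VarS = 2.0 * N - (N ** 2 + 4.0 * N + 1.0) / n + np.sum(1.0 / p) / n
    c = 2.0 * ES / VarS
    return c * S, chi2.sf(c * S, df=c * ES)            # compare c*S_N with a chi2(nu) distribution
```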

The Risk Map test introduced by Colletaz et al. (2013) consists of backtesting VaR at two levels in order to account for both the number and the magnitude of extreme losses. This approach exploits the concept of a super exception, defined as a loss greater than \(VaR(\alpha ')\), with \(\alpha '\) smaller than \(\alpha \). An abnormally high frequency of super exceptions would suggest that the magnitude of the losses with respect to \(VaR(\alpha )\) is too large. The approach consists of jointly testing the number of VaR exceptions and super exceptions,

$$\begin{aligned} H_0:{\mathbb {E}}[I_t(\alpha )]=\alpha \quad \text {and} \quad {\mathbb {E}}[I_t(\alpha ')]=\alpha '. \end{aligned}$$

This joint null hypothesis can be tested using a multivariate version of the unconditional coverage test (\(LR_{uc}\)). The authors follow Pérignon and Smith (2008) and define indicator variables for returns falling in each disjoint interval

$$\begin{aligned} J_{1,t}= & {} I_t(\alpha )-I_t(\alpha ')=\left\{ \begin{array}{ll} 1\quad \text {if}\quad VaR_{t|t-1}(\alpha ')<r_t<VaR_{t|t-1}(\alpha )\\ 0\quad \text {otherwise} \end{array} \right. \\ J_{2,t}= & {} I_t(\alpha ')=\left\{ \begin{array}{ll} 1\quad \text {if}\quad r_t<VaR_{t|t-1}(\alpha ')\\ 0\quad \text {otherwise} \end{array} \right. \end{aligned}$$

and \(J_{0,t}=1-J_{1,t}-J_{2,t}=1-I_t(\alpha )\). The \(\{J_{i,t}\}_{i=0}^{2}\) are Bernoulli random variables equal to one with probability \(1-\alpha \), \(\alpha -\alpha '\), and \(\alpha '\), respectively. However, they are clearly not independent, since only one J variable may be equal to one at any point in time, \(\sum _{i=0}^2J_{i,t}=1\). We can test the joint hypothesis on the specification of the VaR model using a simple likelihood ratio test. Let \(N_{i}=\sum _{t=1}^{T}J_{i,t}\), for \(i=0,1,2\), be the count variable associated with each of the Bernoulli variables. The multivariate unconditional coverage test is a likelihood ratio test, \(LR_{MUC}\), of whether the empirical exception frequencies deviate significantly from the theoretical ones:

$$\begin{aligned} LR_{MUC}(\alpha ,\alpha ')&= -2\ln \left[ (1-\alpha )^{N_0}(\alpha -\alpha ')^{N_1}(\alpha ')^{N_2}\right] +\\&\quad +2 \ln \left[ \left( \frac{N_0}{T}\right) ^{N_0}\left( \frac{N_1}{T}\right) ^{N_1}\left( \frac{N_2}{T}\right) ^{N_2}\right] {\mathop {\rightarrow }\limits ^{d}}\chi ^2_2 \end{aligned}$$
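
A minimal implementation sketch of the \(LR_{MUC}\) test (illustrative code, not the authors' implementation; VaR forecasts are treated as return quantiles, so an exception occurs when the return falls below the forecast):

```python
import numpy as np
from scipy.stats import chi2

def risk_map_lr_muc(returns, var_alpha, var_alpha_prime, alpha, alpha_prime):
    """Multivariate unconditional coverage test of Colletaz et al. (2013) at levels alpha > alpha'."""
    r = np.asarray(returns, dtype=float)
    I_exc = r < np.asarray(var_alpha, dtype=float)             # exceptions of VaR(alpha)
    I_sup = r < np.asarray(var_alpha_prime, dtype=float)       # super exceptions of VaR(alpha')
    T = len(r)
    N2 = int(np.sum(I_sup))
    N1 = int(np.sum(I_exc)) - N2                               # exceptions that are not super exceptions
    N0 = T - N1 - N2

    def xlogy(n, p):                                           # convention 0*log(0) = 0
        return 0.0 if n == 0 else n * np.log(p)

    ll0 = xlogy(N0, 1.0 - alpha) + xlogy(N1, alpha - alpha_prime) + xlogy(N2, alpha_prime)
    ll1 = xlogy(N0, N0 / T) + xlogy(N1, N1 / T) + xlogy(N2, N2 / T)
    lr = -2.0 * (ll0 - ll1)
    return lr, chi2.sf(lr, df=2)
```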


About this article


Cite this article

Garcia-Jorcano, L., Novales, A. A dominance approach for comparing the performance of VaR forecasting models. Comput Stat 35, 1411–1448 (2020). https://doi.org/10.1007/s00180-020-00990-4
