Optimal model averaging estimator for expectile regressions
Introduction
Quantiles and expectiles are informative location measures of the probability distribution of a random variable $Y$. Expectiles are natural extensions of the usual mean, just as quantiles generalize the median. Unlike quantiles, which are percentiles of the cumulative distribution function $F$ of $Y$, expectiles incorporate information about the tail expectations of $Y$ (Efron, 1991). Specifically, for $\tau \in (0,1)$, the $\tau$th quantile of $Y$ is $q_\tau = \inf\{y : F(y) \ge \tau\}$. The $\tau$th expectile of $Y$, in turn, is the solution to the minimization of asymmetrically weighted mean squared errors. That is,
$$\mu_\tau = \arg\min_{\theta \in \mathbb{R}} E\left[\rho_\tau(Y - \theta)\right],$$
where $\rho_\tau(u) = |\tau - 1\{u < 0\}|\, u^2$ and $1\{\cdot\}$ is an indicator function. It satisfies
$$\tau E\left[(Y - \mu_\tau)\, 1\{Y > \mu_\tau\}\right] = (1 - \tau)\, E\left[(\mu_\tau - Y)\, 1\{Y \le \mu_\tau\}\right], \qquad (1)$$
from which it follows that
$$\mu_\tau = \gamma\, E[Y \mid Y > \mu_\tau] + (1 - \gamma)\, E[Y \mid Y \le \mu_\tau],$$
where
$$\gamma = \frac{\tau P(Y > \mu_\tau)}{\tau P(Y > \mu_\tau) + (1 - \tau) P(Y \le \mu_\tau)}.$$
Therefore, the $\tau$th expectile provides information about the expectation of $Y$, conditional on $Y$ being in a tail of its distribution. In addition, sample expectiles are computationally more convenient than sample quantiles, since they can be obtained using only iteratively reweighted least squares (Newey and Powell, 1987). Furthermore, making statistical inference on expectiles is much easier than on quantiles, and estimating expectiles is more efficient than estimating quantiles, since weighted least squares uses the distances between the data points and the fitted value, whereas empirical quantiles use only whether an observation lies below or above it (Abdous and Remillard, 1995, Daouia et al., 2018).
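The iteratively reweighted least squares computation mentioned above can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `sample_expectile`, the tolerance, and the toy data are assumptions made for the example. Each step recomputes a weighted mean, with weight $\tau$ on observations at or above the current estimate and $1 - \tau$ on those below it.

```python
def sample_expectile(ys, tau, tol=1e-12, max_iter=1000):
    """Sample tau-expectile via iteratively reweighted least squares."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie in (0, 1)")
    mu = sum(ys) / len(ys)  # start from the mean, i.e. the 0.5-expectile
    for _ in range(max_iter):
        # Asymmetric weights: 1 - tau below the current estimate, tau otherwise.
        weights = [1.0 - tau if y < mu else tau for y in ys]
        new_mu = sum(w * y for w, y in zip(weights, ys)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


# Toy data (hypothetical): one large observation in the right tail.
data = [1.0, 2.0, 3.0, 4.0, 10.0]
e50 = sample_expectile(data, 0.5)  # coincides with the sample mean
e90 = sample_expectile(data, 0.9)  # pulled toward the right tail
```

For $\tau = 0.5$ the weights are symmetric and the iteration stops at the sample mean, which matches the statement that expectiles extend the mean.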
In risk management, value at risk (VaR) and expected shortfall (ES) are the two most popular risk measures. A risk measure is an estimated amount of capital to be reserved at a given risk level to avoid substantial losses. VaR at a given confidence level can be interpreted as the maximum potential loss at that level. However, VaR reports only a quantile and disregards losses beyond the quantile. In addition, VaR is not sub-additive, which contradicts the diversification principle that combining portfolios can reduce the risk (Acerbi and Tasche, 2002). ES is defined as the conditional expectation of the loss given that the loss exceeds the VaR (Artzner et al., 1999), which avoids these two drawbacks. However, ES fails to satisfy elicitability (Gneiting and Ranjan, 2011), which is a useful property of risk measures because it makes valid point predictions and forecast performance comparisons possible. Unlike VaR and ES, the expectile is the only risk measure that is both sub-additive and elicitable (Ziegel, 2016). Moreover, expectiles are based on a quadratic loss function, making them more sensitive to the extreme values of a distribution than VaR, which is based on an absolute loss. This can be beneficial when measuring potential losses, since investors prefer a risk measure that is sensitive to extreme tail losses. Expectiles also have a clear financial meaning. Following (1), we have, for the $\tau$th expectile $\mu_\tau$ of $Y$,
$$\frac{E[(Y - \mu_\tau)_+]}{E[(Y - \mu_\tau)_-]} = \frac{1 - \tau}{\tau},$$
where $x_+ = \max(x, 0)$ and $x_- = \max(-x, 0)$. Bellini and Di Bernardino (2017) provide a transparent financial meaning for it: the $\tau$th expectile is the amount of money that should be added to a position to produce a pre-specified gain/loss ratio.
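The gain/loss interpretation can be checked numerically. The sketch below is an illustration under stated assumptions, not a method from the paper: the function names and the profit-and-loss sample are hypothetical, and bisection is used here only because the first-order condition is monotone in the shift $m$. It finds the shift $m$ for which the ratio of average gains to average losses of $Y - m$ equals $(1 - \tau)/\tau$.

```python
def first_order_gap(ys, m, tau):
    """tau * mean gain minus (1 - tau) * mean loss of the shifted position y - m."""
    n = len(ys)
    gain = sum(max(y - m, 0.0) for y in ys) / n
    loss = sum(max(m - y, 0.0) for y in ys) / n
    return tau * gain - (1.0 - tau) * loss


def expectile_by_bisection(ys, tau, n_steps=200):
    """Solve the first-order condition for m; the gap is decreasing in m."""
    lo, hi = min(ys), max(ys)
    for _ in range(n_steps):
        mid = 0.5 * (lo + hi)
        if first_order_gap(ys, mid, tau) > 0.0:
            lo = mid  # gains still dominate, so the expectile lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical profit-and-loss sample.
pnl = [-4.0, -1.0, 0.0, 2.0, 8.0]
tau = 0.25
m = expectile_by_bisection(pnl, tau)
gains = sum(max(y - m, 0.0) for y in pnl)
losses = sum(max(m - y, 0.0) for y in pnl)
gain_loss_ratio = gains / losses  # should match (1 - tau) / tau
```

With $\tau = 0.25$ the target ratio is $(1 - \tau)/\tau = 3$, so a low expectile index corresponds to a position whose gains, measured from $m$, are three times its losses on average.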
An expectile regression estimates the conditional expectiles of a response variable given a set of regressors. Expectile regressions are widely used in finance, demography, and education; see Taylor (2008), Schnabel and Eilers (2009), and Sobotka et al. (2013). Quantifying the usefulness of covariates is an essential aspect of an expectile regression. The traditional approach is to first choose one model from a number of candidate models, each containing a different combination of regressors, and then make statistical inferences under the selected model. This ignores the uncertainty introduced by the model selection process and thus often underreports the variance of the resulting inferences, as discussed in Hjort and Claeskens (2003) and Wang et al. (2009). Spiegel et al. (2017) discussed several methods for choosing covariates in a semiparametric expectile model. In addition, when the expectile index is very high or very low, expectile estimators tend to be unstable. This article therefore contributes to the existing literature by providing a model averaging method for expectile regressions. It is an integrated process that neither ignores the uncertainty introduced by model selection nor relies on a single estimator. Furthermore, model averaging frequently yields more accurate predictions of the target variable than model selection.
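To make the notion of an expectile regression concrete, here is a minimal sketch for a single regressor with an intercept, fitted by iteratively reweighted least squares. It is an illustration only: the function name `expectile_regression`, the stopping rule, and the toy data are assumptions, and a practical implementation would handle several regressors with a linear algebra library.

```python
def expectile_regression(xs, ys, tau, tol=1e-10, max_iter=500):
    """Fit y ~ a + b * x at expectile level tau by asymmetric least squares."""
    a, b = 0.0, 0.0
    for _ in range(max_iter):
        # Weight 1 - tau on negative residuals, tau on the rest.
        w = [1.0 - tau if y - a - b * x < 0 else tau for x, y in zip(xs, ys)]
        sw = sum(w)
        swx = sum(wi * x for wi, x in zip(w, xs))
        swy = sum(wi * y for wi, y in zip(w, ys))
        swxx = sum(wi * x * x for wi, x in zip(w, xs))
        swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
        # Closed-form weighted least squares for intercept and slope.
        new_b = (sw * swxy - swx * swy) / (sw * swxx - swx * swx)
        new_a = (swy - new_b * swx) / sw
        if abs(new_a - a) + abs(new_b - b) < tol:
            return new_a, new_b
        a, b = new_a, new_b
    return a, b


# Toy data (hypothetical): a linear trend plus fixed perturbations.
xs = [float(i) for i in range(10)]
noise = [0.3, -0.5, 0.8, -0.2, 0.1, -0.7, 0.4, 0.6, -0.9, 0.2]
ys = [1.0 + 2.0 * x + e for x, e in zip(xs, noise)]
a50, b50 = expectile_regression(xs, ys, 0.5)  # ordinary least squares fit
a80, b80 = expectile_regression(xs, ys, 0.8)  # upper conditional expectile
```

At $\tau = 0.5$ all weights are equal, so the fit reduces to ordinary least squares; other values of $\tau$ tilt the fitted line toward the upper or lower part of the conditional distribution.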
For linear models, Mallows-type model averaging can be used once an unbiased estimator of the squared prediction risk is available. Because there is no analytical solution for an expectile regression and the corresponding model averaging estimator is not a linear function of the response variables, we instead focus on developing a $K$-fold cross-validation model averaging criterion for expectile regressions. Such an extension faces several challenges. First, we must establish the consistency of estimators in a misspecified model with a diverging number of parameters. This is needed to build the asymptotic optimality of our $K$-fold cross-validation model averaging estimator, since all candidate models can be misspecified. Second, the conditional expected expectile prediction error does not have the usual bias–variance decomposition, so the corresponding expectile loss has a complicated form and asymptotic optimality of our proposed method based on the expectile loss is not well established. Third, we also discuss the situation in which the true model is one of the candidate models, and we prove the consistency of the resulting model averaging estimators. This is not trivial because of the randomness of the selected weight vector.
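The general idea of choosing averaging weights by $K$-fold cross-validation under the expectile loss can be sketched as follows. This is a simplified illustration under stated assumptions, not the criterion as defined in the paper: it uses only two candidate models (intercept only, and intercept plus one regressor), interleaved folds, and a grid search over the single free weight; all function names and the data are hypothetical.

```python
def rho(u, tau):
    """Asymmetric squared loss |tau - 1{u < 0}| * u**2."""
    return (1.0 - tau if u < 0 else tau) * u * u


def fit_expectile(xs, ys, tau, use_slope, n_iter=100):
    """IRLS expectile fit of y on an intercept, plus x when use_slope is True."""
    a, b = sum(ys) / len(ys), 0.0
    for _ in range(n_iter):
        w = [1.0 - tau if y - a - b * x < 0 else tau for x, y in zip(xs, ys)]
        sw = sum(w)
        swx = sum(wi * x for wi, x in zip(w, xs))
        swy = sum(wi * y for wi, y in zip(w, ys))
        if use_slope:
            swxx = sum(wi * x * x for wi, x in zip(w, xs))
            swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
            b = (sw * swxy - swx * swy) / (sw * swxx - swx * swx)
        a = (swy - b * swx) / sw
    return a, b


def cv_predictions(xs, ys, tau, n_folds):
    """Out-of-fold predictions of both candidate models (fold of i is i % n_folds)."""
    n = len(ys)
    preds = [[0.0, 0.0] for _ in range(n)]
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        for m, use_slope in enumerate((False, True)):
            a, b = fit_expectile(tx, ty, tau, use_slope)
            for i in range(k, n, n_folds):
                preds[i][m] = a + b * xs[i]
    return preds


def cv_criterion(w, ys, preds, tau):
    """CV expectile loss of the averaged prediction with weights (1 - w, w)."""
    return sum(rho(y - (1.0 - w) * p[0] - w * p[1], tau) for y, p in zip(ys, preds))


def select_weight(xs, ys, tau, n_folds=5, grid=100):
    """Grid search over the weight on the model that includes the regressor."""
    preds = cv_predictions(xs, ys, tau, n_folds)
    candidates = [g / grid for g in range(grid + 1)]
    return min(candidates, key=lambda w: cv_criterion(w, ys, preds, tau))


# Hypothetical data with a strong linear signal and small fixed perturbations.
xs = [float(i) for i in range(20)]
ys = [1.0 + 2.0 * x + 0.3 * ((3 * i) % 7 - 3) for i, x in enumerate(xs)]
w_hat = select_weight(xs, ys, tau=0.9)
```

With more than two candidate models, the grid search would be replaced by a constrained optimizer over the weight simplex; the grid is used here only to keep the example dependency-free.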
Cross-validation criteria are widely used to select weights in model averaging. Hansen and Racine (2012) proposed a jackknife model averaging (JMA) estimator that selects the weights by minimizing a cross-validation criterion for linear models. Their finite-sample results suggest that the estimator is preferable to several other model selection and averaging criteria, especially when the errors are heteroscedastic. Zhang et al. (2013) applied JMA in a linear model with a nondiagonal error covariance structure and lagged dependent variables. Lu and Su (2015) and Ando and Li (2017) extended JMA to quantile regressions and high-dimensional generalized linear models, respectively. Cheng and Hansen (2015) considered forecast combination with factor-augmented regressions based on the leave-$h$-out cross-validation criterion. Gao et al. (2016) implemented a frequentist model averaging method based on leave-subject-out cross-validation for longitudinal data models. Liu et al. (2019) and Zhang et al. (2018) developed $K$-fold cross-validation model averaging for copula models and functional linear regression models, respectively.
The remainder of this paper is organized as follows. Section 2 describes the expectile regression and the model averaging estimator, and proposes the $K$-fold cross-validation model averaging criterion for expectile regressions. The asymptotic optimality of the proposed method and the estimation consistency are discussed in Section 3. In Section 4, we compare the finite-sample performance of the $K$-fold cross-validation model averaging estimator with that of the $K$-fold cross-validation model selection estimator and of several information criterion-based model selection and averaging estimators. Section 5 presents real data examples. Section 6 concludes. Technical proofs of the main results are given in the appendix.
Expectile regression model averaging
Inspired by Newey and Powell (1987), we assume that the observable data are generated by the following linear model: where is the response variable, with is of countably infinite dimension, and are unknown parameters, and are independent and identically distributed unobservable error terms that are independent of . Moreover, we treat the regression design as random in this paper. Then, for , the
Asymptotic optimality
In this section, we show that is asymptotically optimal in the sense of minimizing . The following regularity conditions are required for the $K$-fold cross-validation model averaging estimator to achieve asymptotic optimality. All limiting processes here and throughout this paper hold as $n \to \infty$. We define the pseudo-true parameter in the th candidate model as and
- C.1
For any given , the errors are independent and identically
Simulation study
In this section, we use Monte Carlo simulations to investigate the finite-sample performance of our proposed expectile regression model averaging estimators. 5-fold and 10-fold cross-validations are recommended by Breiman and Spector (1992) and Kohavi (1995), and we consider three different values of $K$. For each , we generate bootstrap samples and use the method discussed at the end of Section 3.2 to select . For simplicity, the $K$-fold cross-validation model averaging
Expectile forecast of excess stock returns
Forecasting expectiles of excess stock returns is essential in financial risk management. We apply the $K$-fold cross-validation model averaging method to predict the expectiles of excess stock returns. The dataset is from Campbell and Thompson (2008) and Lu and Su (2015). It contains monthly price information for the S&P 500 index from January 1950 to December 2005. There are 672 observations on 13 variables. The response variable is the excess return, and the regressors include
Discussion
In this paper, we proposed the $K$-fold cross-validation model averaging expectile regression estimator. When the candidate models do not contain the true model, our proposed method has been shown to be asymptotically optimal in the sense that it leads to an expectile loss that is asymptotically equivalent to that of the infeasible best-possible model averaging estimator. When the true model is one of the candidate models, the resulting weighted estimator is consistent. Also, a comparison for a
Acknowledgment
Bai’s work was supported by the Natural Science Foundation of China (11771268).
References (39)
- Acerbi and Tasche (2002). On the coherence of expected shortfall. J. Bank. Financ.
- Cheng and Hansen (2015). Forecasting with factor-augmented regression: A frequentist model averaging approach. J. Econometrics
- Gao et al. (2016). Model averaging based on leave-subject-out cross-validation. J. Econometrics
- Giessing and He (2019). On the predictive risk in misspecified quantile regression. J. Econometrics
- Hansen and Racine (2012). Jackknife model averaging. J. Econometrics
- Lu and Su (2015). Jackknife model averaging for quantile regressions. J. Econometrics
- Wan et al. (2010). Least squares model averaging by Mallows criterion. J. Econometrics
- Zhang et al. (2013). Model averaging by jackknife criterion in models with dependent data. J. Econometrics
- Abdous and Remillard (1995). Relating quantiles and expectiles under weighted-symmetry. Ann. Inst. Stat. Math.
- Ando and Li (2017). A weight-relaxed model averaging approach for high-dimensional generalized linear models. Ann. Stat.