Elsevier

Econometrics and Statistics

Volume 22, April 2022, Pages 3-16
Gradient boosting in Markov-switching generalized additive models for location, scale, and shape

https://doi.org/10.1016/j.ecosta.2021.04.002

Abstract

Markov-switching generalized additive models for location, scale, and shape constitute a novel class of flexible latent-state time series regression models. In contrast to conventional Markov-switching regression models, they can be used to model different state-dependent parameters of the response distribution — not only the mean, but also variance, skewness, and kurtosis parameters — as potentially smooth functions of a given set of explanatory variables. In addition, the set of possible distributions that can be specified for the response is not limited to the exponential family but additionally includes, for instance, a variety of Box-Cox-transformed, zero-inflated, and mixture distributions. An estimation approach based on the EM algorithm is proposed, where the gradient boosting framework is exploited to prevent overfitting while simultaneously performing variable selection. The feasibility of the suggested approach is assessed in simulation experiments and illustrated in a real-data application, where the conditional distribution of the daily average price of energy in Spain is modeled over time.

Introduction

In recent years, latent-state models — particularly hidden Markov models (HMMs) — have become increasingly popular tools for time series analyses. In many applications, the data at hand follow some pattern within some periods of time but reveal different stochastic properties during other periods (Bartolucci, de Luca, 2003, Zucchini, MacDonald, Langrock, 2016). Typical examples are economic time series, e.g. share returns, oil prices, or bond yields, where the functional relationship between response and explanatory variables differs in periods of high and low economic growth, inflation, or unemployment (Hamilton, 1989). Since their introduction by Goldfeld and Quandt (1973) nearly half a century ago, Markov-switching regression models, i.e. time series regression models where the functional relationship between response and explanatory variables is subject to state-switching controlled by an unobservable Markov chain, have emerged as the method of choice to account for the dynamic patterns described above (Kim, Piger, Startz, 2010, de Souza, Heckman, 2014, de Souza, Heckman, Xu, 2017, Langrock, Kneib, Glennie, Michelot, 2017).

While Markov-switching regression models are typically restricted to modeling the mean of the response (treating the remaining parameters as nuisance and constant across observations), it often appears that other parameters, including variance, skewness, and kurtosis parameters, depend on explanatory variables as well rather than being constant (Rigby and Stasinopoulos, 2005). A motivating example to bear in mind is the daily average price of energy, which we present in detail in Section 5 as a specific case study for Spain. When the energy market is in a calm state, which implies relatively low prices alongside moderate volatility, the oil price exhibits positive correlation with the mean of the conditional energy price distribution, but the variance is usually constant across observations. In contrast, when the energy market is nervous, which implies relatively high and volatile prices, the variance of energy prices is also strongly affected by the oil price. This latter pattern cannot be addressed with existing Markov-switching regression models. As a consequence, by neglecting the strong heteroskedasticity in the process, price forecasts may severely under- or overestimate the associated uncertainty. This is problematic in scenarios where the interest lies not only in the expected prices, but also in quantiles, e.g. when the costs of forecast errors are asymmetric.

Since their introduction in the seminal work of Rigby and Stasinopoulos (2005) a little more than a decade ago, generalized additive models for location, scale, and shape (GAMLSS) have emerged as the standard framework for distributional regression models, where not only the mean, but also other parameters of the response distribution are modeled as potentially smooth functions of a given set of explanatory variables. Over the last decade, GAMLSS have been applied in a variety of different fields, ranging from the analysis of insurance (Heller et al., 2007) and long-term rainfall data (Villarini et al., 2010), through phenological research (Hudson, 2010) and energy studies (Voudouris et al., 2011), to clinical applications, including long-term survival models (de Castro et al., 2010), childhood obesity (Beyerlein et al., 2008), and measurement errors (Mayr et al., 2017).

GAMLSS are applied primarily to data where it is reasonable to assume that the given observations are independent of each other. This is rarely the case when the data have a time series structure. In fact, when data are collected over time, as is the case for daily energy prices, the functional relationship between response and explanatory variables may change over time. Ignoring such changes results in serially correlated residuals due to an under- or overestimation of the true functional relationship. To exploit the flexibility of GAMLSS also within time series settings, we propose a novel class of flexible latent-state time series regression models, which we call Markov-switching GAMLSS. In contrast to conventional Markov-switching regression models, the presented methodology allows us to model different state-dependent parameters of the response distribution as potentially smooth functions of a given set of explanatory variables.
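
To make the idea of state-dependent distribution parameters concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes a Gaussian response, a single explanatory variable, linear predictors, and a log link for the standard deviation; the function name simulate_ms_gaussian and all arguments are hypothetical and chosen for illustration only.

```python
import numpy as np

def simulate_ms_gaussian(n, gamma, delta, beta_mu, beta_sigma, rng):
    """Simulate a Gaussian Markov-switching model (illustrative assumption).

    gamma      : (N, N) transition probability matrix of the hidden chain
    delta      : (N,) initial state distribution
    beta_mu    : (N, 2) intercept and slope of the mean in each state
    beta_sigma : (N, 2) intercept and slope of log(sd) in each state
    """
    n_states = gamma.shape[0]
    x = rng.uniform(-1.0, 1.0, size=n)  # one explanatory variable

    # Draw the unobserved state sequence from the Markov chain
    states = np.empty(n, dtype=int)
    states[0] = rng.choice(n_states, p=delta)
    for t in range(1, n):
        states[t] = rng.choice(n_states, p=gamma[states[t - 1]])

    # Both the mean and the standard deviation depend on state and covariate
    mu = beta_mu[states, 0] + beta_mu[states, 1] * x
    sigma = np.exp(beta_sigma[states, 0] + beta_sigma[states, 1] * x)
    y = rng.normal(mu, sigma)
    return x, states, y
```

Setting the slope in beta_sigma to zero in one state and to a nonzero value in the other mimics the pattern described for energy prices: a constant variance in the calm state and a covariate-driven variance in the nervous state.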

A practical challenge that emerges with the flexibility of Markov-switching GAMLSS is the potentially high dimension of the set of possible model specifications. Each of the parameters of the response distribution varies across two or more states, and each of the associated predictors may involve several explanatory variables, the effect of which may even need to be estimated non-parametrically. Thus, a grid-search approach for model selection, e.g. based on information criteria, is usually infeasible in practice. We therefore derive the MS-gamboostLSS algorithm for model fitting, which incorporates the gradient boosting framework into Markov-switching GAMLSS. Gradient boosting emerged from the field of machine learning, but was later adapted to estimate statistical models (see Mayr et al., 2014). The basic idea is to iteratively apply simple regression functions (which are denoted as base-learners) to each potential explanatory variable one by one and to select in every iteration only the best-performing one. The final solution is then an ensemble of the selected base-learner fits including only the most important variables. The design of the algorithm thus leads to automated variable selection and is even feasible for high-dimensional data settings, where the number of variables exceeds the number of observations.
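
The selection mechanism described above can be sketched for the simplest case. The code below is an illustrative assumption, not the MS-gamboostLSS algorithm itself: it boosts only a mean under squared-error loss, uses one simple linear base-learner per variable, and ignores the hidden states. The function name componentwise_l2_boost and its parameters n_iter and step are hypothetical.

```python
import numpy as np

def componentwise_l2_boost(X, y, n_iter=200, step=0.1):
    """Component-wise gradient boosting with linear base-learners (sketch)."""
    n, p = X.shape
    x_mean = X.mean(axis=0)
    Xc = X - x_mean                    # centered columns; intercept handled separately
    offset = y.mean()                  # starting value of the fit
    coef = np.zeros(p)
    col_ss = (Xc ** 2).sum(axis=0)     # sum of squares of each centered column

    for _ in range(n_iter):
        # Negative gradient of the squared-error loss = current residuals
        resid = y - offset - Xc @ coef
        # Least-squares slope of every single-variable base-learner
        slopes = Xc.T @ resid / col_ss
        # Residual sum of squares each base-learner would leave behind
        rss = (resid ** 2).sum() - slopes ** 2 * col_ss
        best = int(np.argmin(rss))     # keep only the best-performing base-learner
        coef[best] += step * slopes[best]  # small step towards its fit

    intercept = offset - x_mean @ coef  # intercept on the original scale of X
    return intercept, coef
```

Variables that are never selected keep a coefficient of exactly zero, which is how stopping after a moderate number of iterations yields variable selection. The number of iterations and the step length are tuning choices in this sketch.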

This paper is structured as follows: in Section 2, we introduce the different components of Markov-switching GAMLSS and discuss the underlying dependence assumptions. In Section 3, we derive the MS-gamboostLSS algorithm and give a brief overview of related topics, including model selection. The synergy of HMMs and GAMLSS, which lies at the core of this work, is illustrated in Fig. 1. In Section 4, we assess the suggested approach in simulation experiments, where we consider both linear and non-linear base-learners. In Section 5, we illustrate the proposed methodology in a real-data application, where we model the conditional distribution of the daily average price of energy in Spain over time.

Model formulation and dependence structure

In this section, we introduce the model formulation and dependence structure of Markov-switching GAMLSS, which constitute an extension of the closely related but less flexible and in fact nested class of Markov-switching generalized additive models (Markov-switching GAMs, Langrock et al., 2017).

Model fitting

In this section, we derive the MS-gamboostLSS algorithm to estimate the state transition probabilities given by Eq. (1), the initial state probabilities given by Eq. (2), and the state-dependent parameters of the response distribution contained in Eq. (3).
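
The equations referred to here are not reproduced in this excerpt. As a rough sketch of the kind of computation an EM-type approach for hidden Markov models typically relies on, the code below implements the standard scaled forward-backward recursions, which yield the conditional state probabilities used as weights in the E-step. It is not taken from the paper: the function name forward_backward and the inputs dens, gamma, and delta are assumptions for illustration.

```python
import numpy as np

def forward_backward(dens, gamma, delta):
    """Scaled forward-backward recursions for a hidden Markov model (sketch).

    dens  : (T, N) state-dependent density of each observation under each state
    gamma : (N, N) transition probability matrix
    delta : (N,) initial state distribution
    Returns the (T, N) conditional state probabilities and the log-likelihood.
    """
    T, N = dens.shape
    alpha = np.empty((T, N))
    beta = np.empty((T, N))
    scale = np.empty(T)

    # Forward pass, rescaled at every step to avoid numerical underflow
    a = delta * dens[0]
    scale[0] = a.sum()
    alpha[0] = a / scale[0]
    for t in range(1, T):
        a = (alpha[t - 1] @ gamma) * dens[t]
        scale[t] = a.sum()
        alpha[t] = a / scale[t]

    # Backward pass, using the same scaling constants
    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = gamma @ (dens[t + 1] * beta[t + 1]) / scale[t + 1]

    state_probs = alpha * beta          # each row sums to one
    log_lik = np.log(scale).sum()
    return state_probs, log_lik
```

In an EM iteration of this kind, the resulting state probabilities would serve as observation weights when the state-dependent predictors are updated, for example by weighted boosting steps.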

Simulation experiments

To assess the performance of the suggested approach, we present two different simulation experiments, where we consider linear (Section 4.1) and non-linear (Section 4.2) relationships between the explanatory variables and the parameters of the response distribution.

Energy prices in Spain

To illustrate the suggested approach in a real-data setting, we model the conditional distribution of the daily average price of energy in Spain (in c per kWh), EnergyPrice_t, over time. Our aim here is to present a simple case study that provides some intuition and demonstrates the potential of Markov-switching GAMLSS, which is why we focus on a relatively simple model involving only one explanatory variable, the daily oil price (in EUR per barrel), OilPrice_t. The data, which are available in

Discussion

In this paper, we introduced Markov-switching GAMLSS as a novel class of flexible latent-state time series regression models that can be used to model different parameters of the response distribution as potentially smooth functions of a given set of explanatory variables. In addition, we demonstrated how gradient boosting can be exploited to avoid overfitting while simultaneously performing variable selection. Limitations of gradient boosting, particularly the fact that the design of the

References (40)

  • L.E. Baum et al.

    A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains

    Annals of Mathematical Statistics

    (1970)
  • A. Beyerlein et al.

    Alternative regression models to assess increase in childhood BMI

    BMC Medical Research Methodology

    (2008)
  • C. Biernacki et al.

    Assessing a mixture model for clustering with the integrated completed likelihood

    IEEE Transactions on Pattern Analysis and Machine Intelligence

    (2013)
  • G. Celeux et al.

    Selecting hidden Markov model state number with cross-validated likelihood

    Computational Statistics

    (2008)
  • A.P. Dempster et al.

    Maximum likelihood from incomplete data via the EM algorithm

    Journal of the Royal Statistical Society, Series B

    (1977)
  • P.H.C. Eilers et al.

    Flexible smoothing with B-splines and penalties

    Statistical Science

    (1996)
  • S.M. Goldfeld et al.

    A Markov model for switching regressions

    Journal of Econometrics

    (1973)
  • J.D. Hamilton

    A new approach to the economic analysis of nonstationary time series and the business cycle

    Econometrica

    (1989)
  • G.Z. Heller et al.

    Mean and dispersion modeling for policy claims costs

    Scandinavian Actuarial Journal

    (2007)
  • B. Hofner et al.

    gamboostLSS: An R package for model building and variable selection in the GAMLSS framework

    Journal of Statistical Software

    (2016)