Gradient boosting in Markov-switching generalized additive models for location, scale, and shape

doi:10.1016/j.ecosta.2021.04.002

Econometrics and Statistics

Volume 22, April 2022, Pages 3-16

Abstract

Markov-switching generalized additive models for location, scale, and shape constitute a novel class of flexible latent-state time series regression models. In contrast to conventional Markov-switching regression models, they can be used to model different state-dependent parameters of the response distribution — not only the mean, but also variance, skewness, and kurtosis parameters — as potentially smooth functions of a given set of explanatory variables. In addition, the set of possible distributions that can be specified for the response is not limited to the exponential family but additionally includes, for instance, a variety of Box-Cox-transformed, zero-inflated, and mixture distributions. An estimation approach based on the EM algorithm is proposed, where the gradient boosting framework is exploited to prevent overfitting while simultaneously performing variable selection. The feasibility of the suggested approach is assessed in simulation experiments and illustrated in a real-data application, where the conditional distribution of the daily average price of energy in Spain is modeled over time.

Introduction

In recent years, latent-state models, particularly hidden Markov models (HMMs), have become increasingly popular tools for time series analysis. In many applications, the data follow one pattern during some periods but exhibit different stochastic properties during others (Bartolucci and de Luca, 2003; Zucchini et al., 2016). Typical examples are economic time series, e.g. share returns, oil prices, or bond yields, where the functional relationship between response and explanatory variables differs between periods of high and low economic growth, inflation, or unemployment (Hamilton, 1989). Since their introduction by Goldfeld and Quandt (1973) nearly half a century ago, Markov-switching regression models, i.e. time series regression models in which the functional relationship between response and explanatory variables is subject to state-switching controlled by an unobservable Markov chain, have emerged as the method of choice for capturing such dynamic patterns (Kim et al., 2010; de Souza and Heckman, 2014; de Souza et al., 2017; Langrock et al., 2017).

While Markov-switching regression models are typically restricted to modeling the mean of the response (treating the remaining parameters as nuisance and constant across observations), other parameters, including variance, skewness, and kurtosis parameters, often depend on explanatory variables as well (Rigby and Stasinopoulos, 2005). A motivating example to bear in mind is the daily average price of energy, which we present in detail in Section 5 with Spain as a specific case study. When the energy market is in a calm state, which implies relatively low prices alongside moderate volatility, the oil price is positively correlated with the mean of the conditional energy price distribution, but the variance is usually constant across observations. In contrast, when the energy market is nervous, which implies relatively high and volatile prices, the variance of energy prices is also strongly affected by the oil price. This latter pattern cannot be addressed with existing Markov-switching regression models. As a consequence, by neglecting the strong heteroskedasticity in the process, price forecasts may severely under- or overestimate the associated uncertainty. This is problematic in scenarios where the interest lies not only in the expected prices but also in quantiles, e.g. when the costs of forecast errors are asymmetric.

Since their introduction in the seminal work of Rigby and Stasinopoulos (2005) a little more than a decade ago, generalized additive models for location, scale, and shape (GAMLSS) have emerged as the standard framework for distributional regression models, where not only the mean but also other parameters of the response distribution are modeled as potentially smooth functions of a given set of explanatory variables. Over the last decade, GAMLSS have been applied in a variety of fields, ranging from the analysis of insurance data (Heller et al., 2007) and long-term rainfall data (Villarini et al., 2010), through phenological research (Hudson, 2010) and energy studies (Voudouris et al., 2011), to clinical applications, including long-term survival models (de Castro et al., 2010), childhood obesity (Beyerlein et al., 2008), and measurement errors (Mayr et al., 2017).

GAMLSS are applied primarily to data where it is reasonable to assume that the observations are independent of each other. This is rarely the case when the data have a time series structure. In fact, when data are collected over time, e.g. daily energy prices, the functional relationship between response and explanatory variables may itself change over time. A model that assumes a single, fixed relationship then under- or overestimates the true relationship for extended periods, which results in serially correlated residuals. To exploit the flexibility of GAMLSS within time series settings as well, we propose a novel class of flexible latent-state time series regression models, which we call Markov-switching GAMLSS. In contrast to conventional Markov-switching regression models, the presented methodology allows us to model different state-dependent parameters of the response distribution as potentially smooth functions of a given set of explanatory variables.

A practical challenge that emerges with the flexibility of Markov-switching GAMLSS is the potentially high dimension of the set of possible model specifications. Each of the parameters of the response distribution varies across two or more states, and each of the associated predictors may involve several explanatory variables, the effect of which may even need to be estimated non-parametrically. Thus, a grid-search approach for model selection, e.g. based on information criteria, is usually practically infeasible. We therefore derive the MS-gamboostLSS algorithm for model fitting, which incorporates the gradient boosting framework into Markov-switching GAMLSS. Gradient boosting emerged from the field of machine learning, but was later adapted to estimate statistical models (see Mayr et al., 2014). The basic idea is to iteratively apply simple regression functions (which are denoted as base-learners) for each potential explanatory variable one-by-one and select in every iteration only the best performing one. The final solution is then an ensemble of the selected base-learner fits including only the most important variables. The design of the algorithm thus leads to automated variable selection and is even feasible for high-dimensional data settings, where the number of variables exceeds the number of observations.
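The selection step described above can be sketched for the simplest case of squared-error loss and linear base-learners. This is a generic component-wise boosting sketch, not the MS-gamboostLSS algorithm itself; the function name `componentwise_boost`, the step length of 0.1, and the simulated data are assumptions made for illustration.

```python
import numpy as np


def componentwise_boost(X, y, n_iter=200, step=0.1):
    """Component-wise L2 gradient boosting with simple linear base-learners.

    Assumes the columns of X are centered. Returns the intercept and the
    coefficient vector (zero for variables that were never selected).
    """
    n, p = X.shape
    intercept = y.mean()
    coef = np.zeros(p)
    fit = np.full(n, intercept)
    col_ss = (X ** 2).sum(axis=0)           # sum of squares of each column
    for _ in range(n_iter):
        resid = y - fit                     # negative gradient of the L2 loss
        betas = X.T @ resid / col_ss        # fit every base-learner separately
        rss = ((resid[:, None] - X * betas) ** 2).sum(axis=0)
        best = int(np.argmin(rss))          # keep only the best-performing one
        coef[best] += step * betas[best]    # small update of the ensemble
        fit += step * betas[best] * X[:, best]
    return intercept, coef


# Hypothetical data: only variables 0 and 3 carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
X -= X.mean(axis=0)
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.5, size=200)

intercept, coef = componentwise_boost(X, y, n_iter=100)
selected = np.flatnonzero(coef)             # indices of selected variables
```

Stopping after fewer iterations leaves more coefficients at exactly zero, which is how early stopping acts as variable selection in this kind of scheme.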

This paper is structured as follows: in Section 2, we introduce the different components of Markov-switching GAMLSS and discuss the underlying dependence assumptions. In Section 3, we derive the MS-gamboostLSS algorithm and give a brief overview of related topics, including model selection. The synergy of HMMs and GAMLSS, which lies at the core of this work, is illustrated in Fig. 1. In Section 4, we assess the suggested approach in simulation experiments, where we consider both linear and non-linear base-learners. In Section 5, we illustrate the proposed methodology in a real-data application, where we model the conditional distribution of the daily average price of energy in Spain over time.

Section snippets

Model formulation and dependence structure

In this section, we introduce the model formulation and dependence structure of Markov-switching GAMLSS, which constitute an extension of the closely related but less flexible and in fact nested class of Markov-switching generalized additive models (Markov-switching GAMs, Langrock et al., 2017).
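For orientation, a model of this kind can be written down in generic HMM and GAMLSS notation. The symbols below ($S_t$, $\gamma_{ij}$, $\delta_i$, $\theta_k$, $g_k$, $h_{jk}$, $\mathcal{D}$) are assumptions chosen for illustration and need not match the notation or the exact equations of the paper: an $N$-state Markov chain $S_t$ selects which set of predictors is active at time $t$, and each of the $K$ parameters of the response distribution $\mathcal{D}$ is linked to an additive predictor in the explanatory variables.

```latex
\begin{align*}
\gamma_{ij} &= \Pr(S_{t+1} = j \mid S_t = i), \qquad
\delta_i = \Pr(S_1 = i), \qquad i, j = 1, \dots, N, \\
Y_t \mid S_t = s_t &\sim \mathcal{D}\bigl(\theta_1^{(s_t)}(\mathbf{x}_t), \dots, \theta_K^{(s_t)}(\mathbf{x}_t)\bigr), \\
g_k\bigl(\theta_k^{(s_t)}(\mathbf{x}_t)\bigr) &= \beta_{0k}^{(s_t)} + \sum_{j=1}^{J_k} h_{jk}^{(s_t)}(x_{jt}),
\qquad k = 1, \dots, K.
\end{align*}
```

Here $g_k$ is a link function (e.g. the logarithm for a scale parameter) and the $h_{jk}^{(s_t)}$ are state-specific, potentially smooth effects.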

Model fitting

In this section, we derive the MS-gamboostLSS algorithm to estimate the state transition probabilities given by Eq. (1), the initial state probabilities given by Eq. (2), and the state-dependent parameters of the response distribution contained in Eq. (3).
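The E-step of an EM algorithm for hidden Markov models is commonly computed with the forward-backward recursions. The sketch below shows that standard computation for given state-dependent densities; it is not taken from the paper, and the names `e_step`, `normal_pdf`, `gamma`, and `delta` as well as the toy data are assumptions. The resulting state membership probabilities are the kind of weights an M-step would use when updating the state-dependent predictors.

```python
import numpy as np


def normal_pdf(y, mean, sd):
    """Density of the normal distribution, evaluated elementwise."""
    return np.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def e_step(dens, gamma, delta):
    """Scaled forward-backward recursions for a hidden Markov model.

    dens  : (T, N) state-dependent densities evaluated at the observations
    gamma : (N, N) transition probability matrix
    delta : (N,)   initial state distribution
    Returns the log-likelihood and the (T, N) state membership probabilities.
    """
    T, N = dens.shape
    alpha = np.zeros((T, N))
    beta = np.ones((T, N))
    scale = np.zeros(T)
    a = delta * dens[0]
    scale[0] = a.sum()
    alpha[0] = a / scale[0]
    for t in range(1, T):                   # forward pass
        a = (alpha[t - 1] @ gamma) * dens[t]
        scale[t] = a.sum()
        alpha[t] = a / scale[t]
    for t in range(T - 2, -1, -1):          # backward pass, same scaling
        beta[t] = gamma @ (dens[t + 1] * beta[t + 1]) / scale[t + 1]
    weights = alpha * beta                  # each row sums to one
    return np.log(scale).sum(), weights


# Toy example with two states and normal state-dependent distributions.
gamma = np.array([[0.95, 0.05], [0.10, 0.90]])
delta = np.array([0.5, 0.5])
y = np.array([0.1, -0.2, 0.3, 4.8, 5.3, 5.1])
dens = np.column_stack([normal_pdf(y, 0.0, 1.0), normal_pdf(y, 5.0, 1.0)])
loglik, weights = e_step(dens, gamma, delta)
```

Scaling the forward and backward variables at every time step avoids numerical underflow for long series, which is why the log-likelihood is accumulated as a sum of log scaling factors.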

Simulation experiments

To assess the performance of the suggested approach, we present two different simulation experiments, where we consider linear (Section 4.1) and non-linear (Section 4.2) relationships between the explanatory variables and the parameters of the response distribution.
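To give a feeling for the kind of data such experiments involve, the following sketch simulates a two-state Markov-switching series in which both the mean and the standard deviation of a normal response depend linearly on one covariate. It is not the simulation design of the paper: the function name `simulate_ms_data`, all coefficient values, and the transition probabilities are hypothetical.

```python
import numpy as np


def simulate_ms_data(T, gamma, delta, rng):
    """Simulate (x, y, states) from a two-state Markov-switching model with
    state-specific linear predictors for the mean and the log standard
    deviation of a normally distributed response."""
    # Hypothetical coefficients, one row per state: (intercept, slope).
    mean_coef = np.array([[1.0, 0.5], [3.0, 1.0]])
    log_sd_coef = np.array([[-1.0, 0.0], [0.0, 0.8]])   # log link keeps sd > 0

    # Draw the latent state sequence from the Markov chain.
    states = np.zeros(T, dtype=int)
    states[0] = rng.choice(2, p=delta)
    for t in range(1, T):
        states[t] = rng.choice(2, p=gamma[states[t - 1]])

    # Draw the covariate and the state-dependent response.
    x = rng.uniform(-1.0, 1.0, size=T)
    mean = mean_coef[states, 0] + mean_coef[states, 1] * x
    sd = np.exp(log_sd_coef[states, 0] + log_sd_coef[states, 1] * x)
    y = rng.normal(mean, sd)
    return x, y, states


rng = np.random.default_rng(42)
gamma = np.array([[0.95, 0.05], [0.05, 0.95]])
delta = np.array([0.5, 0.5])
x, y, states = simulate_ms_data(1000, gamma, delta, rng)
```

In state 0 the slope of the log standard deviation is zero, so the variance is constant, whereas in state 1 it varies with the covariate. A model that only lets the mean depend on the state and the covariate cannot represent this pattern.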

Energy prices in Spain

To illustrate the suggested approach in a real-data setting, we model the conditional distribution of the daily average price of energy in Spain (in c per kWh), $\text{EnergyPrice}_t$, over time. Our aim here is to present a simple case study that provides some intuition and demonstrates the potential of Markov-switching GAMLSS, which is why we focus on a relatively simple model involving only one explanatory variable, the daily oil price (in EUR per barrel), $\text{OilPrice}_t$. The data, which are available in

Discussion

In this paper, we introduced Markov-switching GAMLSS as a novel class of flexible latent-state time series regression models that can be used to model different parameters of the response distribution as potentially smooth functions of a given set of explanatory variables. In addition, we demonstrated how gradient boosting can be exploited to avoid overfitting while simultaneously performing variable selection. Limitations of gradient boosting, particularly the fact that the design of the

References (40)

F. Bartolucci et al.
Likelihood-based inference for asymmetric stochastic volatility models
Computational Statistics and Data Analysis
(2003)

M. de Castro et al.
A hands-on approach for fitting long-term survival models under the GAMLSS framework
Computer Methods and Programs in Biomedicine
(2010)

R. Langrock et al.
Hidden Markov models with arbitrary state dwell-time distributions
Computational Statistics and Data Analysis
(2011)

R.T. Rockafellar et al.
Conditional value-at-risk for general loss distributions
Journal of Banking and Finance
(2002)

G. Villarini et al.
Nonstationary modeling of a long record of rainfall and temperature over Rome
Advances in Water Resources
(2010)

V. Voudouris et al.
The ACEGES laboratory for energy policy: exploring the production of crude oil
Energy Policy
(2011)

C. Acerbi et al.
Expected shortfall: a natural coherent alternative to value at risk
Economic Notes
(2002)

T. Adam et al.
Joint modelling of multi-scale animal movement data using hierarchical hidden Markov models
Methods in Ecology and Evolution
(2019)

T. Adam et al.
Model-based clustering of time series data: a flexible approach using nonparametric state-switching quantile regression models
Book of Short Papers of the 12th Scientific Meeting on Classification and Data Analysis
(2019)

F. Bartolucci et al.
Latent Markov models for longitudinal data
(2013)

L.E. Baum et al.
A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains
Annals of Mathematical Statistics
(1970)

A. Beyerlein et al.
Alternative regression models to assess increase in childhood BMI
BMC Medical Research Methodology
(2008)

C. Biernacki et al.
Assessing a mixture model for clustering with the integrated completed likelihood
IEEE Transactions on Pattern Analysis and Machine Intelligence
(2000)

G. Celeux et al.
Selecting hidden Markov model state number with cross-validated likelihood
Computational Statistics
(2008)

A.P. Dempster et al.
Maximum likelihood from incomplete data via the EM algorithm
Journal of the Royal Statistical Society, Series B
(1977)

P.H.C. Eilers et al.
Flexible smoothing with B-splines and penalties
Statistical Science
(1996)

S.M. Goldfeld et al.
A Markov model for switching regressions
Journal of Econometrics
(1973)

J.D. Hamilton
A new approach to the economic analysis of nonstationary time series and the business cycle
Econometrica
(1989)

G.Z. Heller et al.
Mean and dispersion modeling for policy claims costs
Scandinavian Actuarial Journal
(2007)

B. Hofner et al.
gamboostLSS: An R package for model building and variable selection in the GAMLSS framework
Journal of Statistical Software
(2016)
