Piecewise autoregression for general integer-valued time series

doi:10.1016/j.jspi.2020.07.003

Journal of Statistical Planning and Inference

Volume 211, March 2021, Pages 271-286

https://doi.org/10.1016/j.jspi.2020.07.003

Abstract

This paper proposes a piecewise autoregression for general integer-valued time series. The conditional mean of the process depends on a parameter which is piecewise constant over time. We derive an inference procedure based on a penalized contrast that is constructed from the Poisson quasi-maximum likelihood of the model. The consistency of the proposed estimator is established. For practical applications, we derive a data-driven procedure based on the slope heuristic to calibrate the penalty term of the contrast; the implementation is carried out through a dynamic programming algorithm, which leads to a procedure of $O(n^{2})$ time complexity. Some simulation results are provided, as well as applications to the US recession data and the number of trades in the stock of Technofirst.

Introduction

We consider an $N_{0}$-valued ($N_{0} = N \cup \{0\}$) process $Y = \{Y_{t}, t \in Z\}$ where the conditional mean $λ_{t} = λ_{t}(θ_{t}^{*}) = E(Y_{t} | F_{t-1})$ (1.1) is a function (see below) of the whole information $F_{t-1}$ up to time $t-1$ and of an unknown parameter $θ_{t}^{*}$ that belongs to a compact subset $Θ \subset R^{d}$ ($d \in N$). The inference in the cases where $θ_{t}^{*} = θ^{*}$ is constant or the distribution of $Y_{t} | F_{t-1}$ is known has been studied by many authors in several directions; see for instance Fokianos et al. (2009), Fokianos and Tjøstheim (2011, 2012), Davis and Liu (2016) and Douc et al. (2017), among others, for some recent works. We consider here a more general setting where $θ_{t}^{*}$ is piecewise constant (multiple change-point problem) and the distribution of $Y_{t} | F_{t-1}$ is unknown. We refer to Franke et al. (2012), Kang and Lee (2014), Doukhan and Kengne (2015), Leung et al. (2017) and the references therein for some tests for change-point detection in integer-valued time series.

Let $(Y_{1}, \dots, Y_{n})$ be a trajectory generated as in model (1.1) and assume that the parameter $θ_{t}^{*}$ is piecewise constant. Also, assume that there exist $K^{*} \in N$, $\underline{θ}^{*} = (θ_{1}^{*}, \dots, θ_{K^{*}}^{*}) \in Θ^{K^{*}}$ and $0 < t_{1}^{*} < \dots < t_{K^{*}-1}^{*} < n$ such that $\{Y_{t}, t_{j-1}^{*} < t \leq t_{j}^{*}\}$ is generated from the $j$th stationary regime; i.e., it is a trajectory of the process $\{Y_{t,j}, t \in Z\}$ (which are not actually observed for $j = 1, \dots, K^{*}$; see Section 2 for some details) satisfying: $E(Y_{t,j} | F_{t-1}) = f(Y_{t-1,j}, Y_{t-2,j}, \dots; θ_{j}^{*}), \forall t_{j-1}^{*} < t \leq t_{j}^{*}$ (1.2) where $F_{t} = σ(Y_{s,j}, s \leq t, j = 1, \dots, K^{*} - 1)$ is the $σ$-field generated by the whole information up to time $t$ and $f$ is a measurable non-negative function assumed to be known up to the parameter $θ_{t}^{*}$. $K^{*}$ is the number of segments (or regimes) of the model; the $j$th segment corresponds to $\{t_{j-1}^{*} + 1, t_{j-1}^{*} + 2, \dots, t_{j}^{*}\}$ and depends on the parameter $θ_{j}^{*}$. $t_{1}^{*}, \dots, t_{K^{*}-1}^{*}$ are the change-point locations; by convention, $t_{0}^{*} = -\infty$ and $t_{K^{*}}^{*} = \infty$. To ensure the identifiability of the change-point locations, it is reasonable to assume that $θ_{j}^{*} \neq θ_{j+1}^{*}$ for $j = 1, \dots, K^{*} - 1$. The case $K^{*} = 1$ corresponds to the model without change. In the sequel, we assume that the random variables $Y_{t}$, $t \in Z$, have the same (up to the parameter $θ_{t}^{*}$) distribution $P$ and denote by $P(\cdot | F_{t-1})$ the distribution of $Y_{t} | F_{t-1}$.
For instance, for an INGARCH$(p^{*}, q^{*})$ representation, we have $λ_{t} = α_{0,j}^{*} + \sum_{i=1}^{q^{*}} α_{i,j}^{*} Y_{t-i} + \sum_{i=1}^{p^{*}} β_{i,j}^{*} λ_{t-i}, \text{ for all } t_{j-1}^{*} < t \leq t_{j}^{*},$ where $α_{0,j}^{*} > 0$, $α_{1,j}^{*}, \dots, α_{q^{*},j}^{*}, β_{1,j}^{*}, \dots, β_{p^{*},j}^{*} \geq 0$. The parameter vector of the $j$th regime is $θ_{j}^{*} = (α_{0,j}^{*}, α_{1,j}^{*}, \dots, α_{q^{*},j}^{*}, β_{1,j}^{*}, \dots, β_{p^{*},j}^{*})$. Therefore, $Θ$ is a compact subset of $(0, \infty) \times [0, \infty)^{p^{*} + q^{*}}$ such that for all $θ = (α_{0}, α_{1}, \dots, α_{q^{*}}, β_{1}, \dots, β_{p^{*}}) \in Θ$, $\sum_{i=1}^{q^{*}} α_{i} + \sum_{i=1}^{p^{*}} β_{i} < 1$. For all $j = 1, \dots, K^{*}$, we assume that $θ_{j}^{*} \in Θ$; hence, there exists a sequence of non-negative real numbers $(ψ_{k}(θ_{j}^{*}))_{k \geq 0}$ such that $λ_{t} = ψ_{0}(θ_{j}^{*}) + \sum_{k \geq 1} ψ_{k}(θ_{j}^{*}) Y_{t-k}$. Then, $f(y_{1}, y_{2}, \dots; θ_{j}^{*}) = ψ_{0}(θ_{j}^{*}) + \sum_{k \geq 1} ψ_{k}(θ_{j}^{*}) y_{k}$ for any $(y_{1}, y_{2}, \dots) \in N_{0}^{\infty}$. For instance, if the distribution $P$ is Poisson, negative binomial or binary, then we get respectively a Poisson, negative binomial or binary INGARCH process; see some examples in Section 4.
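To make the piecewise INGARCH recursion concrete, the following minimal sketch simulates a Poisson INGARCH(1,1) path whose parameter $θ_{j} = (α_{0}, α_{1}, β_{1})$ changes at given break locations. It is an illustration under stated assumptions, not the authors' simulation code: the function names, the Knuth-type Poisson sampler and the initialization at the stationary mean of the first regime are all hypothetical choices, and each new regime simply continues from the past of the previous one rather than from its own unobserved stationary process.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's product method (fine for moderate lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_piecewise_ingarch(n, breaks, thetas, seed=0):
    """Simulate Y_1..Y_n from a Poisson INGARCH(1,1) with piecewise constant parameter.

    breaks : change-point locations t_1 < ... < t_{K-1} (segment j is t_{j-1} < t <= t_j)
    thetas : one tuple (alpha0, alpha1, beta1) per segment, with alpha1 + beta1 < 1
    """
    assert len(thetas) == len(breaks) + 1
    rng = random.Random(seed)
    bounds = list(breaks) + [n]
    a0, a1, b1 = thetas[0]
    # Assumption: start from the stationary mean of the first regime.
    lam_prev = a0 / (1.0 - a1 - b1)
    y_prev = round(lam_prev)
    ys, lams, j = [], [], 0
    for t in range(1, n + 1):
        while t > bounds[j]:  # move to the regime that contains time t
            j += 1
        a0, a1, b1 = thetas[j]
        lam = a0 + a1 * y_prev + b1 * lam_prev  # conditional mean lambda_t
        y = poisson_draw(lam, rng)
        ys.append(y)
        lams.append(lam)
        y_prev, lam_prev = y, lam
    return ys, lams


# Example: n = 200 observations with a single break at t = 100.
ys, lams = simulate_piecewise_ingarch(
    200, [100], [(1.0, 0.3, 0.2), (4.0, 0.3, 0.2)], seed=1
)
```

Replacing `poisson_draw` by another sampler with the same conditional mean (for example a negative binomial one) would give a different member of the class while leaving the recursion for $λ_{t}$ unchanged.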

Our main focus of interest is the estimation of the unknown parameters $(K^{*}, (t_{j}^{*})_{1 \leq j \leq K^{*}-1}, (θ_{j}^{*})_{1 \leq j \leq K^{*}})$ in the model (1.2). This can be viewed as a classical model selection problem. Assume that the observations $Y_{1}, \dots, Y_{n}$ are generated from (1.2). Let $K_{max}$ be an upper bound on the number of segments (note that $K_{max} < n$). Denote by $M_{n}$ the set of partitions of $〚1, n〛$ into at most $K_{max}$ contiguous segments. Let $m = \{T_{1}, \dots, T_{K}\}$ denote a generic element of $M_{n}$ with $K$ segments. Consider the collection $\{S_{m}, m \in M_{n}\}$ where, for a given $m \in M_{n}$, $S_{m}$ is the family of sequences $(θ_{t})$ which are piecewise constant on the partition $m$. Any $ϑ = (θ_{t}) \in S_{m}$ depends on the parameter $\underline{θ} = (θ_{1}, \dots, θ_{K})$, which collects the values of $θ_{t}$ on each segment. Set $S = \cup_{m \in M_{n}} S_{m}$. Denote by $ϑ$ a generic element of $S$, with partition $m$ and parameter $\underline{θ}$. $|\underline{θ}| = K$ denotes the number of segments, also called the dimension of $ϑ$. The true model $ϑ^{*}$, with dimension $K^{*}$, depends on a partition $m^{*}$ and the parameter $\underline{θ}^{*}$.

For any $ϑ \in S$, set $λ_{t}^{ϑ} = \sum_{k=1}^{K} λ_{t}(θ_{k}) 1_{t \in T_{k}}$ and denote by $P(\cdot | F_{t-1}, ϑ)$ the distribution of $Y_{t} | F_{t-1}, ϑ$; let $p(\cdot | F_{t-1}, ϑ) = p(\cdot; λ_{t}^{ϑ})$ be the probability density function of this distribution. For $ϑ \in S$, let $P_{n,ϑ}$ be the conditional distribution of $(Y_{1}, \dots, Y_{n}) | F_{n-1}, ϑ$. We consider the log-likelihood contrast conditioned on $Y_{0}, Y_{-1}, \dots$: $\forall ϑ \in S$, $γ_{n}(ϑ) ≔ γ_{n}(P_{n,ϑ}) = -\log P_{n,ϑ}(Y_{1}, \dots, Y_{n}) = -\sum_{t=1}^{n} \log p(Y_{t} | F_{t-1}, ϑ) = -\sum_{t=1}^{n} \log p(Y_{t}; λ_{t}^{ϑ}).$ Thus, the minimal contrast estimator $\hat{ϑ}_{m}$ of $ϑ^{*}$ on the collection $S_{m}$ is obtained by minimizing the contrast $γ_{n}(ϑ)$ over $ϑ \in S_{m}$; that is, $\hat{ϑ}_{m} = \operatorname{argmin}_{ϑ \in S_{m}} γ_{n}(ϑ)$. The main approaches to model selection take into account the model complexity and select the estimator $\hat{ϑ}_{m_{n}}$ such that $m_{n}$ minimizes the penalized criterion $crit_{n}(m) = γ_{n}(\hat{ϑ}_{m}) + pen_{n}(m), \text{ for all } m \in M_{n}$ (1.3) where $pen_{n} : M_{n} \to R_{+}$ is a penalty function, possibly data-dependent. We now address the following issues.

(i) Semi-parametric setting. Kashikar et al. (2013) have studied structural breaks in Poisson INAR processes using an MCMC and Gibbs sampling approach. Cleynen and Lebarbier (2014, 2017) have recently considered the change-point type problem (1.2) with i.i.d. observations; in their works, the distribution $P$ is assumed to be known and could be Poisson, negative binomial or belong to the exponential family. From the practical viewpoint, we consider the case where $P$ is unknown and deal with the Poisson quasi-likelihood (see for instance Ahmad and Francq, 2016). So in the sequel, $γ_{n}$ is the Poisson quasi-likelihood contrast and $\hat{ϑ}_{m}$ is the Poisson quasi-maximum likelihood estimator (PQMLE).

(ii) Multiple change-point problem from a non-asymptotic point of view. This question is tackled by a model selection approach. Numerous works have been devoted to this issue; see among others Lebarbier (2005), Arlot and Massart (2009), Cleynen and Lebarbier (2014, 2017) and Arlot et al. (2016).

In this (quasi-)log-likelihood framework, it is more usual to consider the Kullback–Leibler risk. For any $ϑ \in S$, the Kullback–Leibler divergence between $P_{n,ϑ^{*}}$ and $P_{n,ϑ}$ is $KL(ϑ^{*}, ϑ) ≔ KL(P_{n,ϑ^{*}}, P_{n,ϑ}) = E\left[\log \frac{P_{n,ϑ^{*}}(Y_{1}, \dots, Y_{n})}{P_{n,ϑ}(Y_{1}, \dots, Y_{n})}\right] = \sum_{t=1}^{n} E\left[\log \frac{p(Y_{t} | F_{t-1}, ϑ^{*})}{p(Y_{t} | F_{t-1}, ϑ)}\right] = \sum_{t=1}^{n} E[\log p(Y_{t}; λ_{t}^{ϑ^{*}})] - \sum_{t=1}^{n} E[\log p(Y_{t}; λ_{t}^{ϑ})],$ where $E$ denotes the expectation with respect to the true distribution of the observations. In the case where $γ_{n}$ is the likelihood contrast, we get $KL(ϑ^{*}, ϑ) = E[γ_{n}(ϑ) - γ_{n}(ϑ^{*})]$. The “ideal” partition $m(ϑ^{*})$ (the one whose estimator is closest to $ϑ^{*}$ according to the Kullback–Leibler risk) satisfies $m(ϑ^{*}) = \operatorname{argmin}_{m \in M_{n}} E[KL(ϑ^{*}, \hat{ϑ}_{m})].$ The corresponding estimator $\hat{ϑ}_{m(ϑ^{*})}$, called the oracle, depends on the true sample distribution and cannot be computed in practice. The goal is to calibrate the penalty term such that the segmentation $\hat{m}$ provides an estimator $\hat{ϑ}_{\hat{m}}$ whose risk is as close as possible to the risk of the oracle, namely such that $E[KL(ϑ^{*}, \hat{ϑ}_{\hat{m}})] \leq C \, E[KL(ϑ^{*}, \hat{ϑ}_{m(ϑ^{*})})]$ for a non-negative constant $C$, expected to be close to 1. This issue is addressed in the above-mentioned papers, and the results obtained rely heavily on the independence of the observations. In our setting, this seems to be a more difficult task. However, we believe that the coupling method can be used as in Lerasle (2011) to overcome this difficulty. We leave this question as the topic of a different research project.

(iii) Multiple change-point problem from an asymptotic point of view. The aim here is to consistently estimate the parameters of the change-point model. This issue has been addressed by several authors using classical contrast/criteria optimization or binary/sequential segmentation/estimation; see for instance Bai and Perron (1998), Davis et al. (2008), Harchaoui and Lévy-Leduc (2010), Bardet et al. (2012), Davis and Yau (2013), Davis et al. (2016), Ma and Yau (2016), Yau and Zhao (2016), Inclán and Tiao (1994), Bai (1997), Fryzlewicz and Subba Rao (2014) and Fryzlewicz (2014), among others, for some advances on this issue. These works and many other papers in the literature on the asymptotic study of the multiple change-point problem are often focused on continuous-valued time series; moreover, the case of a large class of semi-parametric models for discrete-valued time series (such as those discussed earlier) has not yet been addressed.
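As a concrete reading of the Poisson quasi-likelihood contrast mentioned in item (i), the sketch below evaluates $-\sum_{t} (Y_{t} \log λ_{t} - λ_{t})$, i.e. the negative Poisson log-likelihood with the parameter-free term $\log(Y_{t}!)$ dropped, for a given intensity sequence. The helper names and the initial values `y0` and `lam0` are assumptions made for illustration; the paper's own initialization of the recursion is not reproduced here.

```python
import math


def poisson_ql_contrast(ys, lams):
    """Negative Poisson quasi-log-likelihood of ys given intensities lams.

    The log(y!) term is omitted because it does not depend on the parameter.
    """
    return -sum(y * math.log(lam) - lam for y, lam in zip(ys, lams))


def ingarch11_intensities(ys, theta, y0=0, lam0=1.0):
    """Intensities lambda_t = alpha0 + alpha1 * Y_{t-1} + beta1 * lambda_{t-1}.

    y0 and lam0 are hypothetical starting values standing in for the unobserved past.
    """
    a0, a1, b1 = theta
    lams = []
    y_prev, lam_prev = y0, lam0
    for y in ys:
        lam = a0 + a1 * y_prev + b1 * lam_prev
        lams.append(lam)
        y_prev, lam_prev = y, lam
    return lams


# Contrast of a short count series at one candidate parameter value.
ys = [2, 0, 3, 1, 4]
theta = (1.0, 0.3, 0.2)
contrast = poisson_ql_contrast(ys, ingarch11_intensities(ys, theta))
```

On a segment $T_{k}$, the PQMLE would be obtained by minimizing this contrast over $θ \in Θ$ with a numerical optimizer; when the conditional mean is a constant, the minimizer is simply the segment average.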

We consider (1.2) and derive a penalized contrast of type (1.3). We assume that there exists a partition $\underline{τ}^{*}$ of $[0, 1]$ such that $[\underline{τ}^{*} n] = m^{*}$, where $[\underline{τ}^{*} n]$ is the corresponding partition of $〚1, n〛$ obtained from $\underline{τ}^{*}$. We provide sufficient conditions on the penalty $pen_{n}$ for which the estimators $\hat{m}$ and $\hat{ϑ}_{\hat{m}}$ are consistent; that is: $\left(|\hat{m}|, \frac{\hat{m}}{n}, \hat{ϑ}_{\hat{m}}\right) \xrightarrow[n \to \infty]{P} (K^{*}, \underline{τ}^{*}, ϑ^{*})$ where $\frac{\hat{m}}{n}$ is the corresponding partition of $[0, 1]$ obtained from $\hat{m}$.

The paper is organized as follows. In Section 2, we set out some notation and assumptions and define the Poisson QMLE. In Section 3, we derive the estimation procedure and provide the main results. Some simulation results are displayed in Section 4, whereas Section 5 focuses on applications to the US recession data and the daily number of trades in the stock of Technofirst. Section 6 is devoted to a summary and conclusion. The Supporting Information provides the proofs of the main results.

Section snippets

Notations and Poisson QMLE

We set the following classical Lipschitz-type condition on the function $f$ .

Assumption $A_{i}(Θ)$ ($i = 0, 1, 2$)

For any $y \in N_{0}^{N}$, the function $θ \mapsto f(y; θ)$ is $i$ times continuously differentiable on $Θ$ and there exists a sequence of non-negative real numbers $(α_{k}^{(i)})_{k \geq 1}$ satisfying $\sum_{k=1}^{\infty} α_{k}^{(0)} < 1$ (or $\sum_{k=1}^{\infty} α_{k}^{(i)} < \infty$ for $i = 1, 2$), such that for any $y, y' \in N_{0}^{N}$, $\sup_{θ \in Θ} \left\| \frac{\partial^{i} f(y; θ)}{\partial θ^{i}} - \frac{\partial^{i} f(y'; θ)}{\partial θ^{i}} \right\| \leq \sum_{k=1}^{\infty} α_{k}^{(i)} |y_{k} - y_{k}'|,$ where $\| \cdot \|$ denotes any vector or matrix norm.
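For intuition on the case $i = 0$, consider the INGARCH(1,1) conditional mean $λ_{t} = α_{0} + α_{1} Y_{t-1} + β_{1} λ_{t-1}$. Iterating the recursion gives $ψ_{0}(θ) = α_{0}/(1 - β_{1})$ and $ψ_{k}(θ) = α_{1} β_{1}^{k-1}$ for $k \geq 1$, so that at a fixed $θ$ the Lipschitz coefficients sum to $α_{1}/(1 - β_{1})$, which is below 1 exactly when $α_{1} + β_{1} < 1$. The sketch below checks this numerically at a single parameter value; it is only a pointwise illustration (the assumption itself involves a supremum over $Θ$), and the function names are hypothetical.

```python
def ingarch11_psi(theta, n_terms):
    """Coefficients of lambda_t = psi_0 + sum_k psi_k * Y_{t-k} for INGARCH(1,1)."""
    a0, a1, b1 = theta
    psi0 = a0 / (1.0 - b1)
    psis = [a1 * b1 ** (k - 1) for k in range(1, n_terms + 1)]
    return psi0, psis


def lipschitz_sum(theta):
    """Closed form of sum_{k >= 1} psi_k(theta) = alpha1 / (1 - beta1)."""
    _, a1, b1 = theta
    return a1 / (1.0 - b1)


theta = (1.0, 0.3, 0.2)
psi0, psis = ingarch11_psi(theta, n_terms=60)
# The truncated sum should agree with the closed form and stay below 1.
gap = abs(sum(psis) - lipschitz_sum(theta))
```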

In the whole paper, it is assumed that for $j = 1, \dots, K^{*}$ , there exists a stationary and ergodic process ${Y_{t}$

Estimation procedure and main results

In this section, we carry out the estimation of the number of breaks $K^{*} - 1$ and the instants of breaks $\underline{t}^{*}$ by using a penalized contrast. Some asymptotic studies are also reported.
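The minimization of a penalized contrast over partitions can be sketched with a standard dynamic programming recursion. The example below is a simplified illustration, not the procedure of the paper: it assumes the simplest member of the class, a constant conditional mean on each segment (so the segment PQMLE is the segment average and the segment cost is available in closed form from prefix sums), and a linear penalty `beta * K` with a user-chosen, hypothetical constant `beta` in place of the calibrated penalty. With constant-time segment costs, the recursion needs on the order of $n^{2}$ cost evaluations for each number of segments.

```python
import math


def segment_cost_factory(ys):
    """Return cost(s, t): minimal Poisson quasi-likelihood contrast on ys[s:t]
    for a constant mean, attained at the segment average."""
    prefix = [0]
    for y in ys:
        prefix.append(prefix[-1] + y)

    def cost(s, t):
        total = prefix[t] - prefix[s]
        if total == 0:
            return 0.0  # all-zero segment: fitted mean 0, contrast 0
        mean = total / (t - s)
        return -(total * math.log(mean) - total)

    return cost


def best_segmentation(ys, k_max, beta):
    """Minimize contrast + beta * K over partitions with at most k_max segments.

    Returns (k_hat, change_points), where each change-point t_j is the last
    index (1-based) of segment j, matching t_{j-1} < t <= t_j.
    """
    n = len(ys)
    cost = segment_cost_factory(ys)
    inf = float("inf")
    # best[k][t]: minimal contrast of ys[:t] split into exactly k segments.
    best = [[inf] * (n + 1) for _ in range(k_max + 1)]
    arg = [[0] * (n + 1) for _ in range(k_max + 1)]
    best[0][0] = 0.0
    for k in range(1, k_max + 1):
        for t in range(k, n + 1):
            for s in range(k - 1, t):
                c = best[k - 1][s] + cost(s, t)
                if c < best[k][t]:
                    best[k][t] = c
                    arg[k][t] = s
    k_hat = min(range(1, k_max + 1), key=lambda k: best[k][n] + beta * k)
    # Backtrack the segment boundaries of the selected partition.
    cps, t = [], n
    for k in range(k_hat, 0, -1):
        t = arg[k][t]
        cps.append(t)
    return k_hat, sorted(cps)[1:]  # drop the leading boundary 0


# Toy series with one obvious level shift after the 20th observation.
ys = [1] * 20 + [10] * 20
k_hat, change_points = best_segmentation(ys, k_max=3, beta=5.0)
```

For a general conditional mean $f$, `cost(s, t)` would instead require a numerical minimization of the Poisson quasi-likelihood contrast on each candidate segment, while the recursion itself stays the same.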

Some simulation results

In this section, we implement the procedure in the R software (developed by the CRAN project). We will restrict our attention to the estimation of the vector $(K^{*}, \underline{t}^{*})$; i.e., the number of segments $K^{*}$ and the instants of breaks $\underline{t}^{*}$. For the performances of the estimator of the parameter $\underline{θ}^{*}$, we refer to the work of Ahmad and Francq (2016). For each process, we generate $100$ replications following the scenarios considered. The estimated number of segments is computed by using the $QLIK$ criteria

Real data application

We apply our change-point procedure to two examples of real data series. To compute the estimator $\hat{K}_{n}$, the $\hat{κ}_{n}$-penalty is used with $u_{n} = [(\log(n))^{δ}]$ (where $3/2 \leq δ \leq 2$) and $K_{max} = 15$.

Summary and conclusion

This paper focuses on the multiple change-point problem in a general class of integer-valued time series. A penalized contrast estimator based on the Poisson quasi-maximum likelihood of the model is proposed. The theoretical study establishes the consistency of the proposed estimator. A data-driven procedure based on the slope heuristic is also proposed to calibrate the penalty term of the contrast. The simulation study based on three penalty procedures (BIC, $n^{1/3}$ and slope heuristic) displays

Acknowledgments

The authors are grateful to the Executive Editors, Co-Editors and the two anonymous Referees for many relevant suggestions and comments which helped to improve the contents of this article.

References (34)

Davis, R.A. et al. On consistency of minimum description length model selection for piecewise autoregressions. J. Econometrics (2016)
Doukhan, P. et al. On weak dependence conditions for Poisson autoregressions. Statist. Probab. Lett. (2012)
Doukhan, P. et al. Correction to on weak dependence conditions for Poisson autoregressions. Statist. Probab. Lett. (2013)
Fokianos, K. et al. Log-linear Poisson autoregression. J. Multivariate Anal. (2011)
Hudecová, Š. Structural changes in autoregressive models for binary time series. J. Statist. Plann. Inference (2013)
Lebarbier, E. Detecting multiple change-points in the mean of Gaussian process by model selection. Signal Process. (2005)
Ahmad, A. et al. Poisson QMLE of count time series models. J. Time Series Anal. (2016)
Arlot, S. et al. A kernel multiple change-point algorithm via model selection. (2016)
Arlot, S. et al. Data-driven calibration of penalties for least-squares regression. J. Mach. Learn. Res. (2009)
Bai, J. Estimating multiple breaks one at a time. Econometric Theory (1997)
Bai, J. et al. Estimating and testing linear models with multiple structural changes. Econometrica (1998)
Bardet, J.M. et al. Multiple breaks detection in general causal time series using penalized quasi-likelihood. Electron. J. Stat. (2012)
Baudry, J.P. et al. Slope Heuristics: Overview and Implementation. RR-INRIA n° 7223 (2010)
Cleynen, A. et al. Segmentation of the Poisson and negative binomial rate models: a penalized estimator. ESAIM Probab. Stat. (2014)
Cleynen, A. et al. Model selection for the segmentation of multiparameter exponential family distributions. Electron. J. Stat. (2017)
Davis, R.A. et al. Break detection for a class of nonlinear time series models. J. Time Series Anal. (2008)
Davis, R.A. et al. Theory and inference for a class of observation-driven models with application to time series of counts. Statist. Sinica (2016)

¹: Supported by the Institute for advanced studies - IAS (CY Cergy Paris Université, France), the MME-DII center of excellence (ANR-11-LABEX-0023-01) and by the CEA-MITIC (Université Gaston Berger, Sénégal).

²: Developed within the ANR BREAKRISK, France : ANR-17-CE26-0001-01.
