Abstract
In this article, we consider the estimation of the structural change point in the nonparametric model with dependent observations. We introduce a maximum-CUSUM-estimation procedure, where the CUSUM statistic is constructed based on the sum-of-squares aggregation of the difference of the two Nadaraya-Watson estimates using the observations before and after a specific time point. Under some mild conditions, we prove that the statistic tends to zero almost surely if there is no change, and is larger than a threshold asymptotically almost surely otherwise, which yields a threshold-detection strategy. Furthermore, we demonstrate the strong consistency of the change point estimator. In simulations, we discuss the selection of the bandwidth and the threshold used in the estimation, and show the robustness of our method in the long-memory scenario. We apply our method to Nasdaq 100 index data and find that the relation between the realized volatility and the return exhibits several structural changes in 2007–2009.
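A minimal sketch of the procedure described in the abstract may help fix ideas: two Nadaraya-Watson estimates are computed from the observations before and after each candidate time point, their squared differences are aggregated over a grid, and the candidate maximizing the statistic is the change point estimate. The function names, Gaussian kernel, grid construction, and trimming fraction below are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

def nw_estimate(x_grid, X, Y, h):
    """Nadaraya-Watson estimate of E[Y|X=x] on a grid (Gaussian kernel, illustrative)."""
    # Kernel weights K_h(x - X_t), shape (len(x_grid), len(X))
    W = np.exp(-0.5 * ((x_grid[:, None] - X[None, :]) / h) ** 2)
    denom = W.sum(axis=1)
    num = (W * Y[None, :]).sum(axis=1)
    return np.where(denom > 0, num / np.maximum(denom, 1e-12), 0.0)

def max_cusum_change_point(X, Y, h, grid_size=20, trim=0.1):
    """Maximize the sum-of-squares CUSUM statistic over candidate split times."""
    n = len(X)
    # Evaluation grid inside the bulk of the covariate support (assumed choice)
    x_grid = np.linspace(np.quantile(X, 0.1), np.quantile(X, 0.9), grid_size)
    lo, hi = int(n * trim), int(n * (1 - trim))
    stats = np.full(n, -np.inf)
    for t in range(lo, hi):
        phi1 = nw_estimate(x_grid, X[:t], Y[:t], h)   # pre-t N-W estimate
        phi2 = nw_estimate(x_grid, X[t:], Y[t:], h)   # post-t N-W estimate
        stats[t] = np.mean((phi1 - phi2) ** 2)        # sum-of-squares aggregation
    k_hat = int(np.argmax(stats))
    return k_hat, float(stats[k_hat])
```

In a threshold-detection strategy, one would declare a change only if the maximized statistic exceeds a threshold; here we only sketch the argmax step.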
References
Aue A, Rice G, Sönmez O (2018) Detecting and dating structural breaks in functional data without dimension reduction. J R Stat Soc Ser B 80(3):509–529
Bai J (1997) Estimating multiple breaks one at a time. Econom Theory 13(3):315–352
Bandi FM, Renò R (2012) Time-varying leverage effects. J Econom 169(1):94–113
Barigozzi M, Cho H, Fryzlewicz P (2018) Simultaneous multiple change-point and factor analysis for high-dimensional time series. J Econom 206(1):187–225
Basrak B, Davis RA, Mikosch T (2002) Regular variation of GARCH processes. Stoch Process Appl 99(1):95–115
Bollerslev T, Zhou H (2006) Volatility puzzles: a simple framework for gauging return-volatility regressions. J Econom 131(1–2):123–150
Bosq D (1998) Nonparametric statistics for stochastic processes: estimation and prediction, 2nd edn. Springer, New York
Braun JV, Müller HG (1998) Statistical methods for DNA sequence segmentation. Stat Sci 13(2):142–162
Brown RL, Durbin J, Evans JM (1975) Techniques for testing the constancy of regression relationships over time. J R Stat Soc Ser B 37(2):149–192
Chen F, Nkurunziza S (2017) On estimation of the change points in multivariate regression models with structural changes. Commun Stat Theory Methods 46(14):7157–7173
Cho H (2016) Change-point detection in panel data via double CUSUM statistic. Electron J Stat 10(2):2000–2038
Cho H, Fryzlewicz P (2012) Multiscale and multilevel technique for consistent segmentation of nonstationary time series. Stat Sin 22:207–229
Cho H, Fryzlewicz P (2015) Multiple-change-point detection for high dimensional time series via sparsified binary segmentation. J R Stat Soc Ser B 77(2):475–507
Choi JY, Lee MJ (2017) Regression discontinuity: review with extensions. Stat Pap 58(4):1217–1246
Delgado MA, Hidalgo J (2000) Nonparametric inference on structural breaks. J Econom 96(1):113–144
Dette H, Kokot K, Aue A (2017) Functional data analysis in the Banach space of continuous functions. arXiv preprint arXiv:1710.07781
Eichinger B, Kirch C (2018) A MOSUM procedure for the estimation of multiple random change points. Bernoulli 24(1):526–564
Enikeeva F, Harchaoui Z (2019) High-dimensional change-point detection under sparse alternatives. Ann Stat 47(4):2051–2079
Fryzlewicz P (2014) Wild binary segmentation for multiple change-point detection. Ann Stat 42(6):2243–2281
Gurevich G, Vexler A (2005) Change point problems in the model of logistic regression. J Stat Plan Inference 131(2):313–331
Härdle W, Müller M, Sperlich S, Werwatz A (2004) Nonparametric and semiparametric models. Springer, Berlin
Hosking JRM (1981) Fractional differencing. Biometrika 68(1):165–176
Huh J, Park BU (2004) Detection of a change point with local polynomial fits for the random design case. Austral N Z J Stat 46(3):425–441
Hušková M, Maciak M (2017) Discontinuities in robust nonparametric regression with \(\alpha \)-mixing dependence. J Nonparametric Stat 29(2):447–475
Hušková M, Steinebach J (2002) Asymptotic tests for gradual changes. Stat Risk Model 20(1–4):137–152
Jin X (2017) Time-varying return-volatility relation in international stock markets. Int Rev Econ Financ 51:157–173
Johannes J, Rao SS (2011) Nonparametric estimation for dependent data. J Nonparametric Stat 23(3):661–681
Kaul A, Jandhyala VK, Fotopoulos SB (2019) An efficient two step algorithm for high dimensional change point regression models without grid search. J Mach Learn Res 20(111):1–40
Killick R, Fearnhead P, Eckley IA (2012) Optimal detection of changepoints with a linear computational cost. J Am Stat Assoc 107(500):1590–1598
Masry E (1996) Multivariate local polynomial regression for time series: uniform strong consistency and rates. J Time Ser Anal 17(6):571–599
Mohr M, Neumeyer N (2019) Consistent nonparametric change point detection combining CUSUM and marked empirical processes. arXiv preprint arXiv:1901.08491
Mohr M, Selk L (2020) Estimating change points in nonparametric time series regression models. Stat Pap. https://doi.org/10.1007/s00362-020-01162-8
Müller HG (1992) Change-points in nonparametric regression analysis. Ann Stat 20(2):737–761
Nadaraya EA (1964) On estimating regression. Theory Probab Appl 9(1):141–142
Ploberger W, Krämer W (1992) The CUSUM test with OLS residuals. Econometrica 60(2):271–285
Qiu P, Zi X, Zou C (2018) Nonparametric dynamic curve monitoring. Technometrics 60(3):386–397
Su L, Xiao Z (2008) Testing structural change in time-series nonparametric regression models. Stat Interface 1(2):347–366
Venkatraman ES (1992) Consistency results in multiple change-point problems. Technical Report 24. Department of Statistics, Stanford University, Stanford
Wang L (2008) Change-point estimation in long memory nonparametric models with applications. Commun Stat Simul Comput 37(1):48–61
Wang Y, Wang Z, Zi X (2019) Rank-based multiple change-point detection. Commun Stat Theory Methods. https://doi.org/10.1080/03610926.2019.1589515
Watson GS (1964) Smooth regression analysis. Sankhyā 26(4):359–372
Wu JS, Chu CK (1993) Kernel-type estimators of jump points and values of a regression function. Ann Stat 21(3):1545–1566
Wu X, Zhang S, Zhang Q, Ma S (2016) Detecting change point in linear regression using jackknife empirical likelihood. Stat Interface 9(1):113–122
Xu M, Wu Y, Jin B (2019) Detection of a change-point in variance by a weighted sum of powers of variances test. J Appl Stat 46(4):664–679
Zhang T, Lavitas L (2018) Unsupervised self-normalized change-point testing for time series. J Am Stat Assoc 113(522):637–648
Zou C, Yin G, Feng L, Wang Z (2014) Nonparametric maximum likelihood approach to multiple change-point problems. Ann Stat 42(3):970–1002
Acknowledgements
The authors thank Professor Cai-Ya Zhang of ZUCC, as well as senior colleagues and many schoolmates, for their vital comments and suggestions. In particular, we are grateful to the editor and reviewers, whose comments and suggestions have significantly improved this work.
Appendix: Proofs
In this section, we provide the detailed proofs of the theoretical results in Sect. 3. Before proving the main theorems, we state and prove some lemmas.
The following lemma plays a crucial role in deriving some uniform bounds of an \(\alpha \)-mixing process.
Lemma 1
(Theorem 1.3 of Bosq (1998)) Let \((X_t , t\in {\mathbb {Z}})\) be a zero-mean real-valued process such that \(\sup \limits _{1\le t\le n}\Vert X_t\Vert _\infty \le b\). Let \(S_n=\sum _{t=1}^nX_t\). Then
(i)
For each integer \(q\in [1,\frac{n}{2}]\) and each \(\varepsilon >0\),
$$\begin{aligned} \mathsf P (|S_n|>n\varepsilon )\le 4\exp \left( -\frac{\varepsilon ^2}{8b^2}q\right) +22\left( 1+\frac{4b}{\varepsilon }\right) ^{1/2}q\alpha \left( \left\lfloor \frac{n}{2q}\right\rfloor \right) . \end{aligned}$$
(ii)
For each integer \(q\in [1,\frac{n}{2}]\) and each \(\varepsilon >0\),
$$\begin{aligned} \mathsf P (|S_n|>n\varepsilon )\le 4\exp \left( -\frac{\varepsilon ^2}{8v^2(q)}q\right) +22\left( 1+\frac{4b}{\varepsilon }\right) ^{1/2}q\alpha \left( \left\lfloor \frac{n}{2q}\right\rfloor \right) \end{aligned}$$with \(v^2(q)=\frac{2}{p^2}\sigma ^2(q)+\frac{b\varepsilon }{2}\), \(p=\frac{n}{2q}\),
$$\begin{aligned} \sigma ^2(q)= & {} \max _{0\le j\le 2q-1}\mathsf{E }\left\{ (\lfloor jp\rfloor +1-jp)X_{\lfloor jp\rfloor +1}+X_{\lfloor jp\rfloor +2}+\right. \\&\left. \cdots +X_{\lfloor (j+1)p\rfloor }+((j+1)p-\lfloor (j+1)p\rfloor ) X_{\lfloor (j+1)p+1\rfloor }\right\} ^{2}. \end{aligned}$$
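The two bounds in Lemma 1 trade off an exponential term (shrinking in the block number q) against a mixing term (growing in q), so q must be tuned to the mixing rate. A small numeric sketch of bound (i), with an illustrative geometric mixing rate; the function and constants are assumptions for illustration only.

```python
import math

def bosq_bound(n, q, eps, b, alpha):
    """Evaluate the tail bound of Lemma 1(i): P(|S_n| > n*eps) for a
    zero-mean alpha-mixing sequence bounded by b, split into q blocks."""
    # Exponential (Hoeffding-type) term: decreases as q grows
    exp_term = 4 * math.exp(-(eps ** 2) * q / (8 * b ** 2))
    # Mixing term: penalizes short blocks via alpha(floor(n / (2q)))
    mix_term = 22 * math.sqrt(1 + 4 * b / eps) * q * alpha(n // (2 * q))
    return exp_term + mix_term

# Illustrative evaluation: geometrically mixing sequence, alpha(k) = 0.5**k
bound = bosq_bound(n=10_000, q=100, eps=0.5, b=1.0, alpha=lambda k: 0.5 ** k)
```

For geometric mixing the second term is negligible even for moderately large q, which is why the proofs below can take q of order \(\sqrt{t/h_n}\).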
The following lemma establishes bounds on the auto-covariances of \(K_{h_n}(x-X_i)\) and \(Y_i I_{(|Y_i|\le T)} K_{h_n}(x-X_i)\), that is, a truncated version of \(Y_iK_{h_n}(x-X_i)\), which facilitates the application of Lemma 1 to the N–W estimator.
Lemma 2
Suppose \(\{X_t,Y_t\}_{t=1}^n\) is an \(\alpha \)-mixing process, and Assumptions 4–6 are satisfied. Then we have
(i)
$$\begin{aligned} \mathsf Var \left( K_{h_n}(x-X_i)\right) \le c_1 h_n^{-1}, \end{aligned}$$(6.1)
and
$$\begin{aligned}&\left| \mathsf Cov \left( K_{h_n}(x-X_i),K_{h_n}(x-X_j)\right) \right| \nonumber \\&\quad \le \left\{ \begin{aligned}&c_2\min \{h_n^{-1+q_F},h_n^{-2}|i-j|^{-\gamma }\}, \{X_t,Y_t\}_{t=1}^n \in PSM,\\&c_2\min \{h_n^{-1+q_F},h_n^{-2}\rho ^{|i-j|}\}\ \ \ \ , \{X_t,Y_t\}_{t=1}^n \in GSM\\ \end{aligned} \right. \end{aligned}$$(6.2)for any \(1\le i\ne j\le n\) and \(x \in {\mathbb {R}}\), where \(c_1\) and \(c_2\) are two positive constants, and \(q_F\) is defined in Assumption 5.
(ii)
$$\begin{aligned} \mathsf Var (Y_i I_{(|Y_i|\le T)} K_{h_n}(x-X_i))\le c_3 h_n^{-1}, \end{aligned}$$(6.3)
and
$$\begin{aligned}&\left| \mathsf Cov \left( Y_i I_{(|Y_i|\le T)} K_{h_n}(x-X_i),Y_j I_{(|Y_j|\le T)} K_{h_n}(x-X_j)\right) \right| \nonumber \\&\quad \le \left\{ \begin{aligned}&c_4\min \{h_n^{-1+q_F},h_n^{-2}|i-j|^{-\gamma }\}, \ \{X_t,Y_t\}_{t=1}^n \in PSM,\\&c_4\min \{h_n^{-1+q_F},h_n^{-2}\rho ^{|i-j|}\}\ \ \ \ \ , \{X_t,Y_t\}_{t=1}^n \in GSM\\ \end{aligned} \right. \end{aligned}$$(6.4)for any \(1\le i\ne j\le n\) and \(x \in {\mathbb {R}}\), where \(c_3\) and \(c_4\) are two positive constants which do not depend on T.
Proof of Lemma 2
(i) Noting that f, the density function of X, is bounded and \(\Vert K\Vert _2<\infty \), we can prove that
$$\begin{aligned} \mathsf Var \left( K_{h_n}(x-X_i)\right) \le \mathsf{E }\left[ K_{h_n}^2(x-X_i)\right] =h_n^{-1}\int _{{\mathbb {R}}}K^2(z)f(x+h_nz)dz\le \Vert f\Vert _\infty \Vert K\Vert _2^2\,h_n^{-1} \end{aligned}$$
with the variable substitution \(z=(u-x)/h_n\) and the choice \(c_1=\Vert f\Vert _\infty \Vert K\Vert _2^2\). In terms of the covariance, we have
where \(F^{(|i-j|)}\) is defined in Assumption 5. Letting \(\bar{p_F}\) satisfy \(p_F^{-1}+\bar{p_F}^{-1}=1\) and using the Hölder inequality, we can prove that
where \(c_{2,1}\) is equal to \(\Vert K\Vert _{\bar{p_F}}^2\cdot C_2\), noting that \(\Vert K\Vert _{\bar{p_F}}<\infty \), which is implied by \(\Vert K\Vert _1<\infty \) and \(\Vert K\Vert _\infty <\infty \) in Assumption 4. Besides, by using Billingsley's inequality (cf. Chapter 1 of Bosq (1998)), we have
where \(c_{2,2}=4\Vert K\Vert _\infty ^2\). Note that we have \(\alpha (|i-j|)\le C_1\rho ^{|i-j|}\) in Assumption 2 for GSM or \(\alpha (|i-j|)\le C_1|i-j|^{-\gamma }\) in Assumption 1 for PSM. Combining (6.5) and (6.6), and taking \(c_2=\max \{c_{2,1},c_{2,2}C_1\}\), we can prove (6.2).
(ii) We denote \(I_{(|Y_i|\le T)}\) as \(I_i\) for simplicity. Noting that \(\Vert K\Vert _2<\infty \), \(\Vert f\Vert _\infty <\infty \) and \(\sup _{x\in \mathsf supp [X_1]}\mathsf{E }\left[ Y_i^2|X_i=x\right] <C_4\) by Assumptions 4, 5 and 6, respectively, we have
by selecting \(c_3=C_4\cdot \Vert f\Vert _\infty \cdot \Vert K\Vert _2^2\).
For the covariance, setting \(A=\{(y_i,y_j):|y_i|\le T,|y_j|\le T\}\), we have
where \(c_{4,1}=C_3\Vert K\Vert _{\bar{p_G}}^2\), with \(C_3\) defined in Assumption 5. Note that the last step follows from the Hölder inequality, similar to (6.5), with \(\bar{p_G}\) satisfying \(p_G^{-1}+\bar{p_G}^{-1}=1\).
Next we prove the second part in the minimization function in (6.4). Note that, for any \(m\ge 1\) and \(i=1,\cdots ,n\),
and \(e^{C_5(\mathsf{E }\left[ |Y_i|^m\right] )^{1/m}}\le \mathsf{E }\left[ e^{C_5|Y_i|}\right] \le C_6\) by Assumption 6 and Jensen's inequality. By Corollary 1.1 in Bosq (1998) together with (6.7), we have, for any \(m>2\),
with \(c_{4,2}(m)=2^{2-2/m}\cdot m/(m-2)(\mathsf{E }\left[ |Y_i|^m\right] \Vert K\Vert _\infty ^m)^{2/m}\). Letting m tend to infinity, we have \(c_{4,2}(m)\rightarrow 4\cdot (\log (C_6)/C_5)^{2}\Vert K\Vert _\infty ^2:=c_{4,2}\), thus
Using \(\alpha (|i-j|)\le C_1\rho ^{|i-j|}\) in Assumption 2 or \(\alpha (|i-j|)\le C_1|i-j|^{-\gamma }\) in Assumption 1, and taking \(c_4=\max \{c_{4,1},c_{4,2}C_1\}\), we can prove (6.4). \(\square \)
The following lemma shows the uniform bound of the N–W estimator of a PSM process. Note that (i) the bandwidth is selected based on the sample size n, (ii) the estimator \({\widehat{f}}_{s,u}(x;h_n)\) is constructed from the subsample \(\{X_t,Y_t\}_{t=s}^u\), for \(1\le s\le u\le n\), and (iii) the uniform bound is considered with respect to time t.
Lemma 3
Suppose the process \(\{X_t,Y_t\}_{t=1}^n\) is PSM and Assumptions 3–6 are satisfied. Let \({\widehat{f}}_{1,t}(x;h_n)\) and \({\widehat{f}}_{t+1,n}(x;h_n)\) be defined in (2.4). Then, under the model (2.1), we have for all \(x \in {\mathbb {R}}\)
and
Proof of Lemma 3
It is clear that if for some \(\eta >0\),
we can show (6.8) by using the Borel–Cantelli lemma. Next we prove (6.10). Indeed,
noting that \(\log n/\sqrt{n}>\sqrt{\delta } \log t/\sqrt{t}\) when \(\lfloor n\delta \rfloor \le t<n\). A sufficient condition is that for some \(\eta >0\) and \(\delta _1>0\)
when n is large enough, where \(c_5\) is a positive constant (we still use the notation \(\eta \) for \(\eta \sqrt{\delta }\)).
Next we prove (6.11) using Lemma 1(ii). Let \(Z_{1,s,n}=K_{h_n}(x-X_s)-\mathsf{E }\left[ K_{h_n}(x-X_s)\right] \) for \(s=1,\cdots ,n\), and denote the partial sum of \(Z_{1,s,n}\) as \(S_t=\sum _{s=1}^{t}Z_{1,s,n}\).
Firstly, we want to derive the order of \(\sigma ^2(q)\) and \(v^2(q)\) defined in Lemma 1 with the sequence \(\{X_t\}_{t=1}^n\) replaced by the sequence \(\{Z_{1,s,n}\}_{s=1}^t\). Taking \(\varepsilon =\varepsilon _t=(th_n)^{-1/2}\log t\), \(q=q_t=\lfloor t^{1/2}h_n^{-1/2}\rfloor \), and \(p=t/(2q)\), we have \(|Z_{1,s,n}|\le \Vert K \Vert _{\infty }\cdot h_n^{-1}\) and \(q_t\le t/2\) for large n. Using the partition method similar to the proof of Theorem 3.3 in Johannes and Rao (2011), we have (define \(p'=\lfloor p\rfloor +2\))
with the partition point \(B=\lfloor h_n^{-q_F}\rfloor \). Then using Lemma 2, we can obtain for large n
where the term \(B^{1-\gamma }\) is induced by substituting the sum with an integral and the last row follows from \(q_F(\gamma -1)>1\) in Assumption 5. So for \(v^2(q)\), we have
for n large enough, noting that \(p'/(p^2)\simeq 2\varepsilon _t/\log t=o(\varepsilon _t)\) when \(\lfloor \delta n\rfloor \le t\le n-\lfloor \delta n\rfloor \). Then using Lemma 1(ii) and (6.13),
we have for \(\eta >0\)
Because \(q_t\simeq t^{1/2}h_n^{-1/2}\) and thereby \(\varepsilon _tq_th_n\simeq (\log t)\), by selecting \(\eta >\sqrt{8(2+\delta _1)\Vert K\Vert _\infty }\), we have \(A_{1,t}\le 4t^{-{\eta ^2}/(8\Vert K\Vert _\infty )}=O\left( n^{-(2+\delta _1)}\right) \). In terms of \(A_{2,t}\), noting that \((h_n^{-1}/\varepsilon _t)^{1/2}q_t\le t^{3/4}h_n^{-3/4}\le n^{3/4}h_n^{-3/4}\rightarrow \infty \) and \(\alpha \left( \left\lfloor \frac{t}{2q_t}\right\rfloor \right) \le C_1 {\left\lfloor \frac{t}{2q_t}\right\rfloor }^{-\gamma }\le C_1 {\left\lfloor \frac{\sqrt{th_n}}{2}\right\rfloor }^{-\gamma }=O(n^{-\gamma /2}h_n^{-\gamma /2})\), we can obtain
when \(h_n\simeq n^{-\omega }\) for some \(0<\omega \le \frac{\gamma /2-11/4-\delta _1}{\gamma /2+3/4}<1-\frac{14}{2\gamma +3}\).
The proof of (6.9) is similar to that of (6.8), considering the sequence \(\{Z_{1,s,n}\}_{s=t+1}^n\), and is therefore omitted. This completes the proof of the lemma. \(\square \)
Lemma 4
Suppose the process \(\{X_t,Y_t\}_{t=1}^n\) is GSM and Assumptions 3–6 are satisfied. Let \({\widehat{f}}_{1,t}(x;h_n)\) and \({\widehat{f}}_{t+1,n}(x;h_n)\) be defined in (2.4). Then, under the model (2.1), we have for all \(x \in {\mathbb {R}}\)
and
Proof of Lemma 4
The proof of this lemma is similar to that of Lemma 3; for brevity, we only prove (6.15).
We need to prove (6.11) by Lemma 1(ii). Using the same notation as in Lemma 3, we have \(\sigma ^2(q)=O(p'h_n^{-1})\), hence \(v^2(q)\le \Vert K \Vert _{\infty }\cdot h_n^{-1}\varepsilon _t\) for n large enough (see Lemma 2.1 of Bosq (1998)). Then we still have (6.14). By selecting \(\eta >\sqrt{8(2+\delta _1)\Vert K\Vert _\infty }\), we have
In terms of \(A_{2,t}\), note that \(\log t/\sqrt{th_n}\rightarrow 0\), which implies that \(th_n\rightarrow \infty \), and therefore both \(\log t\) and \(\log h_n^{-1}\) can be bounded by \(\sqrt{th_n}\). We have
where \(c_7\) and \(c_8\) are two positive constants, noting that \(\varepsilon _t\rightarrow 0\). Combining (6.14), (6.17) and (6.18), we can prove (6.11). Thus we complete the proof of this lemma. \(\square \)
Lemma 5
Suppose the process \(\{X_t,Y_t\}_{t=1}^n\) is PSM and Assumptions 3–6 are satisfied. Let \({\widehat{g}}_{1,t}(x;h_n)\) and \({\widehat{g}}_{t+1,n}(x;h_n)\) be defined in (2.3). Then, under the model (2.1), we have for all \(x \in {\mathbb {R}}\)
and
Proof of Lemma 5
The proof of this lemma is similar to that of Lemma 3. The only difference is that \(Y_i\) may not be bounded, and we need to adopt the idea of truncation before using Lemma 1(ii). Analogously, our goal is to prove for some \(\eta >0\)
which can be proved by
where \(\varepsilon _t=\log ^2t/\sqrt{th_n}\) and \(\delta _1\) is a tiny positive constant.
Next, we prove (6.21). For \(s=1,\cdots ,n\), define \(\bar{Y}_s=Y_sI_{(|Y_s|\le T_{t})}\) and \(\widetilde{Y}_s=Y_sI_{(|Y_s|>T_{t})}\) with \(T_{t}=c_9\log t\), where \(c_9\) is a positive constant to be determined later. Then
Denote the partial sums in (6.22) as \({\bar{S}}_{t,n}=\sum _{s=1}^t\bar{Z}_{2,s,n}\) and \({\widetilde{S}}_{t,n}=\sum _{s=1}^t\widetilde{Z}_{2,s,n}\). To use Lemma 1(ii), set \(q=q_t=\lfloor t^{1/2}h_{n}^{-1/2}\rfloor \). Then (6.21) can be written as follows,
so we can prove this lemma by showing that
and
For (6.23), before applying the inequality in Lemma 1(ii) as before, we still need to bound \(\sigma ^2(q)\). Together with Lemma 2(ii), it immediately follows, as in (6.12) with \(B = \lfloor h_n^{-q_G}\rfloor \), that for large n
when \(q_G( \gamma -1)>1\). Hence
when n is sufficiently large. Then we can use Lemma 1(ii) as in the previous proof and derive that for \(\eta >0\)
For \(A_{3,t}\), we have
by selecting \(\eta >8\sqrt{c_9(2+\delta _1)\Vert K \Vert _{\infty }}\). For \(A_{4,t}\), we have
where the existence of \(\delta _1\) in the last equality follows from Assumption 3. Then we have proved (6.23).
In terms of (6.24), using the Cauchy–Schwarz and Markov inequalities, we have
by letting \(c_9>(5+2\delta _1)/C_5\), noting that \(\Vert K\Vert _2<\infty \), f is bounded, \(\sup _{x\in \mathsf supp [X_1]}\mathsf{E }\left[ Y_1^2|X_1=x\right] <\infty \), and \(\mathsf{E }\left[ e^{C_5|Y_1|}\right] <\infty \).
Combining (6.23) and (6.24), we obtain (6.21). Hence the proof is completed. \(\square \)
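The truncation device in the proof above can be checked numerically: by Markov's inequality, \(\mathsf P (|Y|>c_9\log t)\le C_6\, t^{-c_9C_5}\) under the exponential moment condition, so the tail part is summable once \(c_9\) is large enough (matching the choice \(c_9>(5+2\delta _1)/C_5\)). A rough Monte Carlo illustration, where the Laplace distribution and the numeric constants are illustrative assumptions, not part of the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

C5 = 1.0           # exponential-moment rate E[exp(C5*|Y|)] < inf (illustrative)
c9 = 6.0 / C5      # truncation constant, chosen so c9 * C5 > 5
t = 10_000
Y = rng.laplace(scale=0.4, size=t)   # E[exp(|Y|)] is finite since scale < 1

T_t = c9 * np.log(t)                 # truncation level T_t = c9 * log t
tail_frac = np.mean(np.abs(Y) > T_t)                          # empirical P(|Y| > T_t)
markov = np.mean(np.exp(C5 * np.abs(Y))) * np.exp(-C5 * T_t)  # Markov bound
```

With these constants the truncation level is far in the tail, so essentially no observation is truncated, while the Markov bound decays polynomially fast in t.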
Lemma 6
Suppose the process \((X_t,Y_t)\) is GSM and Assumptions 3–6 are satisfied. Let \({\widehat{g}}_{1,t}(x;h_n)\) and \({\widehat{g}}_{t+1,n}(x;h_n)\) be defined in (2.3). Then, under the model (2.1), we have for all \(x \in {\mathbb {R}}\)
and
Proof of Lemma 6
We still use the notation in Lemma 5, and want to show (6.23) and (6.24). Together with Lemma 2(ii), it still holds that
noting that \(\frac{p'h_n^{-2}}{p'h_n^{-1}}\cdot \rho ^{h_n^{-q_G} }=h_n^{-1}\rho ^{h_n^{-q_G} }\simeq (h_n^{-q_G} )^{\frac{1}{q_G}}\cdot \rho ^{h_n^{-q_G} }\rightarrow 0\). So we only need to bound the part \(A_{4,t}\) containing the mixing coefficient, as follows,
where \(c_{12}, c_{13}\) and \(c_{14}\) are strictly positive constants, noting that the second-to-last row is deduced as in (6.18), and \(\varepsilon _t\rightarrow 0\). \(\square \)
Lemma 7
Suppose that the assumptions in Lemmas 3 and 5 (or Lemmas 4 and 6) are satisfied. Let \({{\widehat{\varphi }}}_{1,t}(x;h_n)\) and \({{\widehat{\varphi }}}_{t+1,n}(x;h_n)\) be defined in (2.2). If the grid point \(x_i\in {\mathcal {X}}\), then we have under the model (2.1)
and
Proof of Lemma 7
Because of similarity, we only show the first equation. Consider the decomposition
By Lemma 3, \(\widehat{f}_{1,t}(x_i;h_n)-\mathsf{E }[\widehat{f}_{1,t}(x_i;h_n)]=o_{a.s.}(1)\) uniformly over t. Since \(x_i\in {{{\mathcal {X}}}}=(\mathsf supp [X_1])^\circ \), the density function f has a nonzero lower bound in a sufficiently small neighbourhood of \(x_i\). We have \(\mathsf{E }[\widehat{f}_{1,t}(x_i;h_n)]=\mathsf{E }\left[ K_{h_n}({X_1-x_i})\right] =h_n^{-1}\int _{{\mathbb {R}}} K((u-x_i)/h_n)f(u)du=\int _{{\mathbb {R}}} K(z)f(z h_n+x_i)dz\ge c_{15}\) for some positive constant \(c_{15}\) if \(h_n\) is small enough. Then, by Lemmas 3 and 5, we have
noting that \(\mathsf{E }\left[ |\widehat{g}_{1,t}(x_i;h_n)|\right] \le \mathsf{E }\left[ |Y_1K_{h_n}(X_1-x_i)|\right] =\int _{{\mathbb {R}}}\mathsf{E }\left[ |Y_1|\mid X_1=h_nz+x_i\right] \cdot K(z)f(h_nz+x_i)dz\le C_4\Vert K\Vert _1\Vert f\Vert _\infty \).
Hence we complete the proof of this lemma. \(\square \)
Proof of Theorem 1
Since there is no change point, we have \(\mathsf{E }[\widehat{f}_{1,t}(x_i;h_n)]=\mathsf{E }[\widehat{f}_{t+1,n}(x_i;h_n)]\) and \(\mathsf{E }[\widehat{g}_{1,t}(x_i;h_n)]=\mathsf{E }[\widehat{g}_{t+1,n}(x_i;h_n)]\), and therefore by Lemma 7
We complete the proof of the theorem. \(\square \)
Proof of Theorem 2
By definition,
Note that \(\mathsf{E }[K_{h_n}(X_1-x_i)]\) is bounded. When \(x_i\in {\mathcal {X}}\cap {\mathcal {Y}}\), both \(\varphi _1(x)-\varphi _2(x)\) and f(x) are bounded away from zero in a small neighbourhood of \(x_i\). Therefore \(\varLambda _{h_n}^2(x_i)\) is also bounded away from zero when n is large enough.
Next we prove (3.2). Considering the special point \(t=k\), it is obvious that
Note that the sequences \(\{ X_s,Y_s \}_{s=1}^k\) and \(\{ X_s,Y_s \}_{s=k+1}^n\) are strictly stationary. By definition and Lemma 7, we have
Thus, we complete the proof of this theorem. \(\square \)
Proof of Theorem 3
(i) It is a direct corollary of Theorems 1 and 2.
Now we prove (ii). From the statement in the proof of Theorem 2, we have
almost surely when \(n\rightarrow \infty \). If we can show that
almost surely, for any small \(\varepsilon >0\), when \(n\rightarrow \infty \), then (6.31) and (6.32) imply that \({\widehat{k}}\ge k-n\varepsilon \) almost surely when \(n\rightarrow \infty \). Using the same method for the case when \(k+n\varepsilon \le t\le n-\varDelta _n\), we can obtain \({\widehat{k}}\le k+n\varepsilon \) almost surely when \(n\rightarrow \infty \). Combining these two inequalities, we can show that \(({\widehat{k}}-k)/n=o_{a.s.}(1)\) by letting \(\varepsilon \rightarrow 0\).
Next we prove (6.32). When \(t\le k-n\varepsilon \), on the one hand, we have
uniformly in t by Lemma 7. On the other hand,
We show that \(B_3(x_i)\) is negligible and \(B_4(x_i)\) is the leading term. Similar to the argument in the proof of Lemma 5, we can show that the asymptotic order of \({\widehat{g}}_{t+1,k}(x_i;h_n)-\mathsf{E }[{\widehat{g}}_{t+1,k}(x_i;h_n)]\) is also \(\frac{\log ^2 n}{\sqrt{nh_n}}\), as the sample size \(k-t\) is of order n. Note that \({\widehat{f}}_{t+1,n}\) is bounded away from zero almost surely when n is large enough. Since \((k-t)/(n-t)<\theta /(1-\theta )\) and \((n-k)/(n-t)<1\), we have by Lemma 5 (or Lemma 6)
In terms of \(B_4(x_i)\), we have by Lemma 3
Therefore,
Combining (6.33)–(6.35), we can obtain uniformly in t
Note that
Thus, we have (6.32), and we complete the proof. \(\square \)
Yang, Q., Li, YN. & Zhang, Y. Change point detection for nonparametric regression under strongly mixing process. Stat Papers 61, 1465–1506 (2020). https://doi.org/10.1007/s00362-020-01196-y
Keywords
- Change point detection
- CUSUM statistic
- Nonparametric regression
- Strongly mixing process
- Structural change