Adaptive efficient estimation for generalized semi-Markov big data models

Annals of the Institute of Statistical Mathematics

Abstract

In this paper we study generalized semi-Markov high-dimensional regression models in continuous time, observed at fixed discrete time moments. The generalized semi-Markov process has dependent jumps and is therefore an extension of the semi-Markov regression introduced in Barbu et al. (Stat Inference Stoch Process 22:187–231, 2019a). For such models we consider estimation problems in a nonparametric setting. To this end, we develop model selection procedures for which sharp non-asymptotic oracle inequalities for the robust risks are obtained. Moreover, we give constructive sufficient conditions which, through the obtained oracle inequalities, provide the adaptive robust efficiency property in the minimax sense. Note also that these results use neither sparsity conditions nor the parameter dimension of the model. As examples, regression models constructed through spherically symmetric noise impulses and truncated fractional Poisson processes are considered. Numerical Monte Carlo simulations confirming the theoretical results are given in the supplementary materials.

References

  • Barbu, V. S., Beltaief, S., Pergamenshchikov, S. M. (2019a). Robust adaptive efficient estimation for semi-Markov nonparametric regression models. Statistical Inference for Stochastic Processes, 22(2), 187–231.

  • Barbu, V. S., Beltaief, S., Pergamenshchikov, S. M. (2019b). Robust statistical signal processing in semi-Markov nonparametric regression models. Les Annales de l’I.S.U.P, 63(2–3), 45–56.

  • Beltaief, S., Chernoyarov, O., Pergamenshchikov, S. M. (2020). Model selection for the robust efficient signal processing observed with small Lévy noise. Annals of the Institute of Statistical Mathematics, 72, 1205–1235.

  • Barbu, V. S., Limnios, N. (2008). Semi-Markov chains and hidden semi-Markov models toward applications: Their use in reliability and DNA analysis. Lecture notes in statistics. New York: Springer.

  • Barndorff-Nielsen, O. E., Shephard, N. (2001). Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial mathematics. Journal of the Royal Statistical Society Series B (Statistical Methodology), 63, 167–241.

  • Biard, R., Saussereau, B. (2014). Fractional Poisson processes: Long-range dependence and applications in ruin theory. Journal of Applied Probability, 51, 727–740.

  • Fourdrinier, D., Pergamenshchikov, S. M. (2007). Improved selection model method for the regression with dependent noise. Annals of the Institute of Statistical Mathematics, 59(3), 435–464.

  • Fujimori, K. (2019). The Dantzig selector for a linear model of diffusion processes. Statistical Inference for Stochastic Processes, 22, 475–498.

  • Hastie, T., Friedman, J., Tibshirani, R. (2008). The elements of statistical learning: Data mining, inference and prediction (2nd ed.). Springer series in statistics. New York: Springer.

  • Ibragimov, I. A., Khasminskii, R. Z. (1981). Statistical estimation: Asymptotic theory. New York: Springer.

  • Kassam, S. A. (1988). Signal detection in Non-Gaussian Noise. IX. New York: Springer.

  • Konev, V. V., Pergamenshchikov, S. M. (2009). Nonparametric estimation in a semimartingale regression model. Part 1. Oracle inequalities. Vestnik Tomskogo Gosudarstvennogo Universiteta. Matematika i Mekhanika, 3(7), 23–41.

  • Konev, V. V., Pergamenshchikov, S. M. (2009). Nonparametric estimation in a semimartingale regression model. Part 2. Robust asymptotic efficiency. Vestnik Tomskogo Gosudarstvennogo Universiteta. Matematika i Mekhanika, 4(8), 31–45.

  • Konev, V. V., Pergamenshchikov, S. M. (2012). Efficient robust nonparametric estimation in a semimartingale regression model. Annales de l’Institut Henri Poincaré, Probabilités et Statistiques, 48(4), 1217–1244.

  • Konev, V. V., Pergamenshchikov, S. M. (2015). Robust model selection for a semimartingale continuous time regression from discrete data. Stochastic Processes and their Applications, 125, 294–326.

  • Kutoyants, Yu. A. (1994). Identification of dynamical systems with small noise. Dordrecht: Kluwer Academic Publishers Group.

  • Laskin, N. (2003). Fractional Poisson processes. Communications in Nonlinear Science and Numerical Simulation, 8, 201–213.

  • Liptser, R. S., Shiryaev, A. N. (1989). Theory of martingales. New York: Springer.

  • Maheshwari, A., Vellaisamy, P. (2016). On the long-range dependence of fractional Poisson processes. Journal of Applied Probability, 53(4), 989–1000.

  • Middleton, D. (1979). Canonical non-Gaussian noise models: Their implications for measurement and for prediction of receiver performance. IEEE Transactions on Electromagnetic Compatibility, 21, 209–220.

  • Novikov, A. A. (1975). On discontinuous martingales. Theory of Probability and its Applications, 20(1), 11–26.

  • Pinsker, M. S. (1981). Optimal filtration of square integrable signals in Gaussian white noise. Problems of Information Transmission, 17, 120–133.

  • Repin, O. N., Saichev, A. I. (2000). Fractional Poisson law. Radiophysics and Quantum Electronics, 43(9), 738–741.


Author information

Corresponding author

Correspondence to Serguei Pergamenchtchikov.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research was supported by RSF, Project No 20-61-47043 (National Research Tomsk State University, Russia).

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 448 KB)

Appendix

1.1 Property of the penalty term

Lemma 4

For any \(n\ge \,1\) and \(\lambda \in \varLambda \),

$$\begin{aligned} P_{{T}}(\lambda ) \le \mathcal{R}_{{Q}}(\widehat{S}_{{\lambda }},S) +\frac{\mathbf{L}_{{1,Q}}}{T}, \end{aligned}$$

where the coefficient \(P_{{T}}(\lambda )\) is defined in (68) and \(\mathbf{L}_{{1,Q}}\) is defined in (64).

Proof

From (30) and (33) we obtain

$$\begin{aligned} \text{ Err }(\lambda ) \ge \sum _{{j=1}}^{p} \left( \lambda (j) \widehat{\theta }_{{j,p}} - \overline{\theta }_{{j,p}} \right) ^2 = \sum _{{j=1}}^{p} \left( (\lambda (j)-1) \overline{\theta }_{{j,p}}+ \frac{\lambda (j)}{\sqrt{T}}\xi _{{j,p}} \right) ^2 \,. \end{aligned}$$

Now Proposition 4 implies

$$\begin{aligned} \mathcal{R}_{{Q}}(\widehat{S}_{{\lambda }},S)= \mathbf{E}_{{Q}}\, \text{ Err }(\lambda ) \ge \, \frac{1}{T}\sum _{{j=1}}^{p} \lambda ^2(j) \mathbf{E}_{{Q}}\,\xi ^2_{{j,p}} \ge \, P_{{T}}(\lambda )-\frac{\mathbf{L}_{{1,Q}}}{T} \,. \end{aligned}$$

Hence we obtain the result. \(\square \)
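
Remark. The passage from the first display of the proof to the second is an expansion of a square. As a sketch, assume (this is not restated in the Appendix) that the variables \(\xi _{{j,p}}\) are centered under Q with finite second moments, and let \(a_{{j}}\) and \(b_{{j}}\) be non-random coefficients; this notation is introduced here only for illustration. Then

$$\begin{aligned} \mathbf{E}_{{Q}}\left( a_{{j}}+b_{{j}}\,\xi _{{j,p}}\right) ^2 = a^2_{{j}} +2a_{{j}}b_{{j}}\,\mathbf{E}_{{Q}}\,\xi _{{j,p}} +b^2_{{j}}\,\mathbf{E}_{{Q}}\,\xi ^2_{{j,p}} = a^2_{{j}}+b^2_{{j}}\,\mathbf{E}_{{Q}}\,\xi ^2_{{j,p}} \ge b^2_{{j}}\,\mathbf{E}_{{Q}}\,\xi ^2_{{j,p}} \,. \end{aligned}$$

Taking \(a_{{j}}=(\lambda (j)-1)\overline{\theta }_{{j,p}}\) and \(b^2_{{j}}=\lambda ^2(j)/T\), the factor appearing in the lower bound for \(\mathcal{R}_{{Q}}(\widehat{S}_{{\lambda }},S)\), and summing over \(1\le j\le p\) gives the first inequality of that bound.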

1.2 Properties of the Fourier coefficients

Lemma 5

Let f be an absolutely continuous function, \(f: [0,1]\rightarrow {{\mathbb {R}}},\) with \(\Vert \dot{f}\Vert <\infty \), and let g be a simple function, \(g: [0,1]\rightarrow {{\mathbb {R}}},\) of the form \( g(t)=\sum _{j=1}^p\,c_{{j}}\,\chi _{(t_{j-1},t_{{j}}]}(t),\) where \(c_{{j}}\) are some constants. Then, for any \(\widetilde{\varepsilon }>0,\) the function \(\varDelta =f-g\) satisfies the following inequalities

$$\begin{aligned} \Vert \varDelta \Vert ^{2}\le (1+\widetilde{\varepsilon })\Vert \varDelta \Vert ^{2}_{{p}} + (1+\widetilde{\varepsilon }^{-1})\frac{\Vert \dot{f}\Vert ^{2}}{p^{2}}\,, \quad \Vert \varDelta \Vert ^{2}_{{p}}\le (1+\widetilde{\varepsilon })\Vert \varDelta \Vert ^{2} + (1+\widetilde{\varepsilon }^{-1})\frac{\Vert \dot{f}\Vert ^{2}}{p^{2}} \,. \end{aligned}$$
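
The two inequalities of Lemma 5 can be checked numerically. The sketch below is only an illustration under assumptions that are not stated in this Appendix: the grid is taken to be \(t_{{j}}=j/p\), the norm \(\Vert \cdot \Vert _{{p}}\) is taken to be the empirical norm \(\Vert \varDelta \Vert ^2_{{p}}=p^{-1}\sum _{j=1}^p\varDelta ^2(t_{{j}})\), and \(\Vert \cdot \Vert \) is the usual norm in \(L_2[0,1]\). The function name `lemma5_sides`, the test function \(f(t)=\sin (2\pi t)\) and the random constants \(c_{{j}}\) are hypothetical choices made for this example.

```python
import numpy as np

def lemma5_sides(p=20, eps=0.5, m=2000, seed=0):
    """Return (norm2, bound1, norm2_p, bound2) for the two inequalities of Lemma 5.

    Assumptions (not taken from the paper): t_j = j/p and
    ||Delta||_p^2 = (1/p) * sum_j Delta(t_j)^2.
    """
    rng = np.random.default_rng(seed)
    c = rng.normal(size=p)                     # arbitrary constants c_j (illustrative)
    f = lambda t: np.sin(2.0 * np.pi * t)      # illustrative absolutely continuous f
    df_norm2 = 2.0 * np.pi ** 2                # exact value of ||f'||^2 for this f

    # Midpoint rule: m points inside each cell (t_{j-1}, t_j].
    u = (np.arange(m) + 0.5) / m
    t = (np.arange(p)[:, None] + u[None, :]) / p
    delta = f(t) - c[:, None]                  # g equals c_j on the j-th cell
    norm2 = np.mean(delta ** 2)                # approximates the integral of Delta^2 on [0, 1]

    # Empirical norm on the grid points t_j = j/p, j = 1, ..., p.
    tj = np.arange(1, p + 1) / p
    norm2_p = np.mean((f(tj) - c) ** 2)

    rest = (1.0 + 1.0 / eps) * df_norm2 / p ** 2
    bound1 = (1.0 + eps) * norm2_p + rest      # right-hand side of the first inequality
    bound2 = (1.0 + eps) * norm2 + rest        # right-hand side of the second inequality
    return norm2, bound1, norm2_p, bound2

if __name__ == "__main__":
    for p in (5, 20, 100):
        n2, b1, n2p, b2 = lemma5_sides(p=p)
        print(p, n2 <= b1, n2p <= b2)
```

A larger \(\widetilde{\varepsilon }\) (the argument `eps`) shrinks the remainder term \((1+\widetilde{\varepsilon }^{-1})\Vert \dot{f}\Vert ^2p^{-2}\) but inflates the leading factor \(1+\widetilde{\varepsilon }\), so the parameter can be tuned to balance the two terms.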

Lemma 6

Let the function S(t) in (3) be absolutely continuous and have an absolutely integrable derivative. Then the coefficients \((\overline{\theta }_{{j,p}})_{1\leqslant j \leqslant p}\) defined in (29) satisfy the inequalities \(\max _{{2\leqslant j \leqslant p}} j \vert \overline{\theta }_{{j,p}} \vert \leqslant 2 \sqrt{2} \int ^{1}_{{0}}\vert \dot{S}(t) \vert \mathrm {d}t\).

Lemma 7

For any \(p\ge 2\), \(1\le N\le p\) and \(r>0\), the coefficients \((\theta _{{j,p}})_{{1\le j\le p}}\) of functions S from the class \(\mathcal{W}_{{\mathbf{r},1}}\) satisfy, for any \(\widetilde{\varepsilon }>0\), the following inequality \( \sum ^{p}_{{j=N}} \theta ^{2}_{{j,p}} \, \le \,(1+\widetilde{\varepsilon }) \,\sum _{{j\ge N}}\,\theta ^{2}_{{j}} \, +(1+\widetilde{\varepsilon }^{-1})\,r\,p^{-2}\).

Lemma 8

For any \(p\ge 2\) and \(r>0\), the coefficients \((\theta _{{j,p}})_{{1\le j\le p}}\) of functions S satisfy the inequality \( \max _{{1\le j\le p}} \,\sup _{{S\in \mathcal{W}_{{\mathbf{r},1}}}} \left( |\theta _{{j,p}} - \theta _{{j}}| -2\pi \sqrt{r} \,j\,p^{-1} \right) \, \le 0\).

Lemma 9

For any \(p\ge 2\) and \(r>0\) the correction coefficients from (29) satisfy the inequality \( \sup _{{S\in \mathcal{W}_{{\mathbf{r},2}}}} \sum ^{p}_{{j=1}} h^{2}_{{j,p}} \le \,3r\,p^{-2}\).

Lemmas 5–9 are proven in Konev and Pergamenshchikov (2015).

About this article

Cite this article

Barbu, V.S., Beltaief, S. & Pergamenchtchikov, S. Adaptive efficient estimation for generalized semi-Markov big data models. Ann Inst Stat Math 74, 925–955 (2022). https://doi.org/10.1007/s10463-022-00820-y

  • DOI: https://doi.org/10.1007/s10463-022-00820-y
