Adaptive efficient estimation for generalized semi-Markov big data models

Barbu, Vlad Stefan; Beltaief, Slim; Pergamenchtchikov, Serguei

doi:10.1007/s10463-022-00820-y

Published: 05 March 2022

Annals of the Institute of Statistical Mathematics, volume 74, pages 925–955 (2022)

Vlad Stefan Barbu¹, Slim Beltaief² & Serguei Pergamenchtchikov¹,³

Abstract

In this paper we study generalized semi-Markov high-dimensional regression models in continuous time, observed at fixed discrete time moments. The generalized semi-Markov process has dependent jumps and is therefore an extension of the semi-Markov regression introduced in Barbu et al. (Stat Inference Stoch Process 22:187–231, 2019a). For such models we consider estimation problems in a nonparametric setting. To this end, we develop model selection procedures for which sharp non-asymptotic oracle inequalities for the robust risks are obtained. Moreover, we give constructive sufficient conditions which, through the obtained oracle inequalities, provide the adaptive robust efficiency property in the minimax sense. It should also be noted that these results use neither sparsity conditions nor the parameter dimension of the model. As examples, we consider regression models constructed through spherically symmetric noise impulses and truncated fractional Poisson processes. Numerical Monte Carlo simulations confirming the theoretical results are given in the supplementary materials.

References

Barbu, V. S., Beltaief, S., Pergamenshchikov, S. M. (2019a). Robust adaptive efficient estimation for semi-Markov nonparametric regression models. Statistical Inference for Stochastic Processes, 22(2), 187–231.
Barbu, V. S., Beltaief, S., Pergamenshchikov, S. M. (2019b). Robust statistical signal processing in semi-Markov nonparametric regression models. Les Annales de l’I.S.U.P, 63(2–3), 45–56.
Beltaief, S., Chernoyarov, O., Pergamenshchikov, S. M. (2020). Model selection for the robust efficient signal processing observed with small Lévy noise. Annals of the Institute of Statistical Mathematics, 72, 1205–1235.
Barbu, V. S., Limnios, N. (2008). Semi-Markov chains and hidden semi-Markov models toward applications: Their use in reliability and DNA analysis. Lecture notes in statistics. New York: Springer.
Barndorff-Nielsen, O. E., Shephard, N. (2001). Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial mathematics. Journal of the Royal Statistical Society Series B (Statistical Methodology), 63, 167–241.
Biard, R., Saussereau, B. (2014). Fractional Poisson processes: Long-range dependence and applications in ruin theory. Journal of Applied Probability, 51, 727–740.
Fourdrinier, D., Pergamenshchikov, S. M. (2007). Improved selection model method for the regression with dependent noise. Annals of the Institute of Statistical Mathematics, 59(3), 435–464.
Fujimori, K. (2019). The Dantzig selector for a linear model of diffusion processes. Statistical Inference for Stochastic Processes, 22, 475–498.
Hastie, T., Friedman, J., Tibshirani, R. (2008). The elements of statistical learning: Data mining, inference and prediction (2nd ed.). New York: Springer (Springer Series in Statistics).
Ibragimov, I. A., Khasminskii, R. Z. (1981). Statistical estimation: Asymptotic theory. New York: Springer.
Kassam, S. A. (1988). Signal detection in Non-Gaussian Noise. IX. New York: Springer.
Konev, V. V., Pergamenshchikov, S. M. (2009). Nonparametric estimation in a semimartingale regression model. Part 1. Oracle inequalities. Vestnik Tomskogo Gosudarstvennogo Universiteta. Matematika i Mekhanika, 3(7), 23–41.
Konev, V. V., Pergamenshchikov, S. M. (2009). Nonparametric estimation in a semimartingale regression model. Part 2. Robust asymptotic efficiency. Vestnik Tomskogo Gosudarstvennogo Universiteta. Matematika i Mekhanika, 4(8), 31–45.
Konev, V. V., Pergamenshchikov, S. M. (2012). Efficient robust nonparametric estimation in a semimartingale regression model. Annales de l’Institut Henri Poincaré, Probabilités et Statistiques, 48(4), 1217–1244.
Konev, V. V., Pergamenshchikov, S. M. (2015). Robust model selection for a semimartingale continuous time regression from discrete data. Stochastic Processes and their Applications, 125, 294–326.
Kutoyants, Yu. A. (1994). Identification of dynamical systems with small noise. Dordrecht: Kluwer Academic Publishers Group.
Laskin, N. (2003). Fractional Poisson processes. Communications in Nonlinear Science and Numerical Simulation, 8, 201–213.
Liptser, R. S., Shiryaev, A. N. (1989). Theory of martingales. New York: Springer.
Maheshwari, A., Vellaisamy, P. (2016). On the long-range dependence of fractional Poisson processes. Journal of Applied Probability, 53(4), 989–1000.
Middleton, D. (1979). Canonical non-Gaussian noise models: Their implications for measurement and for prediction of receiver performance. IEEE Transactions on Electromagnetic Compatibility, 21, 209–220.
Novikov, A. A. (1975). On discontinuous martingales. Theory of Probability and its Applications, 20(1), 11–26.
Pinsker, M. S. (1981). Optimal filtration of square integrable signals in Gaussian white noise. Problems of Information Transmission, 17, 120–133.
Repin, O. N., Saichev, A. I. (2000). Fractional Poisson law. Radiophysics and Quantum Electronics, 43(9), 738–741.

Author information

Authors and Affiliations

Laboratoire de Mathématiques Raphaël Salem, UMR 6085 CNRS-Université de Rouen Normandie, Avenue de l’Université, BP.12, 76801, Saint-Etienne-du-Rouvray, France
Vlad Stefan Barbu & Serguei Pergamenchtchikov
ALTEN de Toulouse, 9 Rue Alain Fournier, 31300, Toulouse, France
Slim Beltaief
International Laboratory of Statistics of Stochastic Processes and Quantitative Finance, Tomsk State University, Tomsk, Russia
Serguei Pergamenchtchikov

Corresponding author

Correspondence to Serguei Pergamenchtchikov.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research was supported by RSF, Project No 20-61-47043 (National Research Tomsk State University, Russia).

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 448 KB)

Appendix

1.1 Property of the penalty term

Lemma 4

For any $n\ge \,1$ and $\lambda \in \varLambda $,

$$\begin{aligned} P_{{T}}(\lambda ) \le \mathcal{R}_{{Q}}(\widehat{S}_{{\lambda }},S) +\frac{\mathbf{L}_{{1,Q}}}{T}, \end{aligned}$$

where the coefficient $P_{{T}}(\lambda )$ is defined in (68) and $\mathbf{L}_{{1,Q}}$ is defined in (64).

Proof

From (30) and (33) we obtain

$$\begin{aligned} \text{ Err }(\lambda ) \ge \sum _{{j=1}}^{p} \left( \lambda (j) \widehat{\theta }_{{j,p}} - \overline{\theta }_{{j,p}} \right) ^2 = \sum _{{j=1}}^{p} \left( (\lambda (j)-1) \overline{\theta }_{{j,p}}+ \frac{\lambda (j)}{\sqrt{T}}\xi _{{j,p}} \right) ^2 \,. \end{aligned}$$

Now Proposition 4 implies

$$\begin{aligned} \mathcal{R}_{{Q}}(\widehat{S}_{{\lambda }},S)= \mathbf{E}_{{Q}}\, \text{ Err }(\lambda ) \ge \, \frac{1}{T}\sum _{{j=1}}^{p} \lambda ^2(j) \mathbf{E}_{{Q}}\,\xi ^2_{{j,p}} \ge \, P_{{T}}(\lambda )-\frac{\mathbf{L}_{{1,Q}}}{T} \,. \end{aligned}$$

Hence we obtain the result. $\square $
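
The passage from the lower bound on $\text{Err}(\lambda )$ to the bound on the risk can be spelled out term by term. The following is only a sketch, under assumptions that are not reproduced in this section: the weights $\lambda (j)$ and the coefficients $\overline{\theta }_{j,p}$ are deterministic, the noise terms $\xi _{j,p}$ are centered under $Q$, and the estimators decompose as $\widehat{\theta }_{j,p}=\overline{\theta }_{j,p}+T^{-1/2}\xi _{j,p}$. Under these assumptions the cross term vanishes:

$$\begin{aligned} \mathbf{E}_{Q}\left( (\lambda (j)-1)\,\overline{\theta }_{j,p}+\frac{\lambda (j)}{\sqrt{T}}\,\xi _{j,p}\right) ^2&= (\lambda (j)-1)^2\,\overline{\theta }^2_{j,p} +\frac{2(\lambda (j)-1)\lambda (j)}{\sqrt{T}}\,\overline{\theta }_{j,p}\,\mathbf{E}_{Q}\,\xi _{j,p} +\frac{\lambda ^2(j)}{T}\,\mathbf{E}_{Q}\,\xi ^2_{j,p} \\&= (\lambda (j)-1)^2\,\overline{\theta }^2_{j,p} +\frac{\lambda ^2(j)}{T}\,\mathbf{E}_{Q}\,\xi ^2_{j,p} \;\ge \; \frac{\lambda ^2(j)}{T}\,\mathbf{E}_{Q}\,\xi ^2_{j,p} \,. \end{aligned}$$

Summing over $j=1,\ldots ,p$ gives the first inequality in the risk bound of the proof. The remaining comparison with $P_{T}(\lambda )$ is the part supplied by Proposition 4.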

1.2 Properties of the Fourier coefficients

Lemma 5

Let f be an absolutely continuous function, $f: [0,1]\rightarrow {{\mathbb {R}}}$, with $\Vert \dot{f}\Vert <\infty $, and let g be a simple function, $g: [0,1]\rightarrow {{\mathbb {R}}}$, of the form $g(t)=\sum _{j=1}^p\,c_{{j}}\,\chi _{(t_{j-1},t_{{j}}]}(t)$, where the $c_{{j}}$ are some constants. Then, for any $\widetilde{\varepsilon }>0$, the function $\varDelta =f-g$ satisfies the following inequalities

$$\begin{aligned} \Vert \varDelta \Vert ^{2}\le (1+\widetilde{\varepsilon })\Vert \varDelta \Vert ^{2}_{{p}} + (1+\widetilde{\varepsilon }^{-1})\frac{\Vert \dot{f}\Vert ^{2}}{p^{2}}\,, \quad \Vert \varDelta \Vert ^{2}_{{p}}\le (1+\widetilde{\varepsilon })\Vert \varDelta \Vert ^{2} + (1+\widetilde{\varepsilon }^{-1})\frac{\Vert \dot{f}\Vert ^{2}}{p^{2}} \,. \end{aligned}$$
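
As a worked illustration of how the free parameter $\widetilde{\varepsilon }$ can be used (a remark added here, not a statement from the cited proof), write $A=\Vert \varDelta \Vert ^{2}_{p}$ and $B=\Vert \dot{f}\Vert ^{2}/p^{2}$; both symbols are shorthand introduced only for this computation. Taking $\widetilde{\varepsilon }=1$ gives constants equal to 2, and minimizing the right-hand side of the first inequality over $\widetilde{\varepsilon }>0$ gives a bound on the norms themselves:

$$\begin{aligned} \widetilde{\varepsilon }=1:\quad&\Vert \varDelta \Vert ^{2}\le 2A+2B\,, \\ \inf _{\widetilde{\varepsilon }>0}\left\{ (1+\widetilde{\varepsilon })A+(1+\widetilde{\varepsilon }^{-1})B\right\}&= A+B+2\sqrt{AB}=\left( \sqrt{A}+\sqrt{B}\right) ^{2}, \quad \text{so}\quad \Vert \varDelta \Vert \le \Vert \varDelta \Vert _{p}+\frac{\Vert \dot{f}\Vert }{p}\,. \end{aligned}$$

For $A,B>0$ the infimum is attained at $\widetilde{\varepsilon }=\sqrt{B/A}$. The same computation applied to the second inequality yields $\Vert \varDelta \Vert _{p}\le \Vert \varDelta \Vert +\Vert \dot{f}\Vert /p$.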

Lemma 6

Let the function S(t) in (3) be absolutely continuous and have an absolutely integrable derivative. Then the coefficients $(\overline{\theta }_{{j,p}})_{1\leqslant j \leqslant p}$ defined in (29) satisfy the inequalities $\max _{{2\leqslant j \leqslant p}} j \vert \overline{\theta }_{{j,p}} \vert \leqslant 2 \sqrt{2} \int ^{1}_{{0}}\vert \dot{S}(t) \vert \mathrm {d}t$.

Lemma 7

For any $p\ge 2$, $1\le N\le p$ and $r>0$, the coefficients $(\theta _{{j,p}})_{{1\le j\le p}}$ of functions S from the class $\mathcal{W}_{{\mathbf{r},1}}$ satisfy, for any $\widetilde{\varepsilon }>0$, the following inequality $ \sum ^{p}_{{j=N}} \theta ^{2}_{{j,p}} \, \le \,(1+\widetilde{\varepsilon }) \,\sum _{{j\ge N}}\,\theta ^{2}_{{j}} \, +(1+\widetilde{\varepsilon }^{-1})\,r\,p^{-2}$.

Lemma 8

For any $p\ge 2$ and $r>0$, the coefficients $(\theta _{{j,p}})_{{1\le j\le p}}$ of functions S satisfy the inequality $ \max _{{1\le j\le p}} \,\sup _{{S\in \mathcal{W}_{{\mathbf{r},1}}}} \left( |\theta _{{j,p}} - \theta _{{j}}| -2\pi \sqrt{r} \,j\,p^{-1} \right) \, \le 0$.
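
Written out pointwise, the statement of Lemma 8 reads as follows. This is only a restatement of the inequality above, added for readability:

$$\begin{aligned} |\theta _{j,p}-\theta _{j}|\,\le \,\frac{2\pi \sqrt{r}\,j}{p} \quad \text{for all}\quad 1\le j\le p \quad \text{and all}\quad S\in \mathcal{W}_{\mathbf{r},1}\,. \end{aligned}$$

The bound grows linearly in $j$: at $j=p$ it equals $2\pi \sqrt{r}$, so it is small only for indices $j$ that are small compared with $p$.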

Lemma 9

For any $p\ge 2$ and $r>0$ the correction coefficients from (29) satisfy the inequality $ \sup _{{S\in \mathcal{W}_{{\mathbf{r},2}}}} \sum ^{p}_{{j=1}} h^{2}_{{j,p}} \le \,3r\,p^{-2}$.

Lemmas 5–9 are proven in Konev and Pergamenshchikov (2015).

About this article

Cite this article

Barbu, V.S., Beltaief, S. & Pergamenchtchikov, S. Adaptive efficient estimation for generalized semi-Markov big data models. Ann Inst Stat Math 74, 925–955 (2022). https://doi.org/10.1007/s10463-022-00820-y

Received: 10 March 2021
Revised: 09 November 2021
Accepted: 04 January 2022
Published: 05 March 2022
Issue Date: October 2022
DOI: https://doi.org/10.1007/s10463-022-00820-y
