Abstract
In this paper we study generalized semi-Markov high-dimensional regression models in continuous time, observed at fixed discrete time moments. The generalized semi-Markov process has dependent jumps and is therefore an extension of the semi-Markov regression introduced in Barbu et al. (Stat Inference Stoch Process 22:187–231, 2019a). For such models we consider estimation problems in a nonparametric setting. To this end, we develop model selection procedures for which sharp non-asymptotic oracle inequalities for the robust risks are obtained. Moreover, we give constructive sufficient conditions which, through the obtained oracle inequalities, provide the adaptive robust efficiency property in the minimax sense. It should also be noted that, for these results, we use neither sparsity conditions nor the parameter dimension in the model. As examples, regression models constructed through spherically symmetric noise impulses and truncated fractional Poisson processes are considered. Numerical Monte Carlo simulations confirming the theoretical results are given in the supplementary materials.
References
Barbu, V. S., Beltaief, S., Pergamenshchikov, S. M. (2019a). Robust adaptive efficient estimation for semi-Markov nonparametric regression models. Statistical Inference for Stochastic Processes, 22(2), 187–231.
Barbu, V. S., Beltaief, S., Pergamenshchikov, S. M. (2019b). Robust statistical signal processing in semi-Markov nonparametric regression models. Les Annales de l’I.S.U.P, 63(2–3), 45–56.
Beltaief, S., Chernoyarov, O., Pergamenshchikov, S. M. (2020). Model selection for the robust efficient signal processing observed with small Lévy noise. Annals of the Institute of Statistical Mathematics, 72, 1205–1235.
Barbu, V. S., Limnios, N. (2008). Semi-Markov chains and hidden semi-Markov models toward applications: Their use in reliability and DNA analysis. Lecture notes in statistics. New York: Springer.
Barndorff-Nielsen, O. E., Shephard, N. (2001). Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial mathematics. Journal of the Royal Statistical Society Series B (Statistical Methodology), 63, 167–241.
Biard, R., Saussereau, B. (2014). Fractional Poisson processes: Long-range dependence and applications in ruin theory. Journal of Applied Probability, 51, 727–740.
Fourdrinier, D., Pergamenshchikov, S. M. (2007). Improved selection model method for the regression with dependent noise. Annals of the Institute of Statistical Mathematics, 59(3), 435–464.
Fujimori, K. (2019). The Dantzig selector for a linear model of diffusion processes. Statistical Inference for Stochastic Processes, 22, 475–498.
Hastie, T., Friedman, J., Tibshirani, R. (2008). The elements of statistical learning: Data mining, inference and prediction (2nd ed.). Springer series in statistics. New York: Springer.
Ibragimov, I. A., Khasminskii, R. Z. (1981). Statistical estimation: Asymptotic theory. New York: Springer.
Kassam, S. A. (1988). Signal detection in Non-Gaussian Noise. IX. New York: Springer.
Konev, V. V., Pergamenshchikov, S. M. (2009). Nonparametric estimation in a semimartingale regression model. Part 1. Oracle inequalities. Vestnik Tomskogo Gosudarstvennogo Universiteta. Matematika i Mekhanika, 3(7), 23–41.
Konev, V. V., Pergamenshchikov, S. M. (2009). Nonparametric estimation in a semimartingale regression model. Part 2. Robust asymptotic efficiency. Vestnik Tomskogo Gosudarstvennogo Universiteta. Matematika i Mekhanika, 4(8), 31–45.
Konev, V. V., Pergamenshchikov, S. M. (2012). Efficient robust nonparametric estimation in a semimartingale regression model. Annales de l’Institut Henri Poincaré, Probabilités et Statistiques, 48(4), 1217–1244.
Konev, V. V., Pergamenshchikov, S. M. (2015). Robust model selection for a semimartingale continuous time regression from discrete data. Stochastic Processes and their Applications, 125, 294–326.
Kutoyants, Yu. A. (1994). Identification of dynamical systems with small noise. Dordrecht: Kluwer Academic Publishers Group.
Laskin, N. (2003). Fractional Poisson processes. Communications in Nonlinear Science and Numerical Simulation, 8, 201–213.
Liptser, R. S., Shiryaev, A. N. (1989). Theory of martingales. New York: Springer.
Maheshwari, A., Vellaisamy, P. (2016). On the long-range dependence of fractional Poisson processes. Journal of Applied Probability, 53(4), 989–1000.
Middleton, D. (1979). Canonical non-Gaussian noise models: Their implications for measurement and for prediction of receiver performance. IEEE Transactions on Electromagnetic Compatibility, 21, 209–220.
Novikov, A. A. (1975). On discontinuous martingales. Theory of Probability and its Applications, 20(1), 11–26.
Pinsker, M. S. (1981). Optimal filtration of square integrable signals in Gaussian white noise. Problems of Information Transmission, 17, 120–133.
Repin, O. N., Saichev, A. I. (2000). Fractional Poisson law. Radiophysics and Quantum Electronics, 43(9), 738–741.
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest.
Additional information
This research was supported by RSF, Project No 20-61-47043 (National Research Tomsk State University, Russia).
Appendix
1.1 Property of the penalty term
Lemma 4
For any \(n\ge \,1\) and \(\lambda \in \varLambda \),
where the coefficient \(P_{{T}}(\lambda )\) is defined in (68) and \(\mathbf{L}_{{1,Q}}\) is defined in (64).
Proof
Now Proposition 4 implies
Hence we obtain the result. \(\square \)
1.2 Properties of the Fourier coefficients
Lemma 5
Let f be an absolutely continuous function, \(f: [0,1]\rightarrow {\mathbb {R}},\) with \(\Vert \dot{f}\Vert <\infty \), and let g be a simple function, \(g: [0,1]\rightarrow {\mathbb {R}},\) of the form \( g(t)=\sum _{j=1}^p\,c_{j}\,\chi _{(t_{j-1},t_{j}]}(t),\) where \(c_{j}\) are some constants. Then, for any \(\varepsilon >0,\) the function \(\varDelta =f-g\) satisfies the following inequalities
Lemma 6
Let the function S(t) in (3) be absolutely continuous and have an absolutely integrable derivative. Then the coefficients \((\overline{\theta }_{{j,p}})_{1\leqslant j \leqslant p}\) defined in (29) satisfy the inequalities \(\max _{{2\leqslant j \leqslant p}} j \vert \overline{\theta }_{{j,p}} \vert \leqslant 2 \sqrt{2} \int ^{1}_{{0}}\vert \dot{S}(t) \vert \mathrm {d}t\).
Lemma 7
For any \(p\ge 2\), \(1\le N\le p\) and \(r>0\), the coefficients \((\theta _{{j,p}})_{{1\le j\le p}}\) of functions S from the class \(\mathcal{W}_{{\mathbf{r},1}}\) satisfy, for any \(\widetilde{\varepsilon }>0\), the following inequality \( \sum ^{p}_{{j=N}} \theta ^{2}_{{j,p}} \, \le \,(1+\widetilde{\varepsilon }) \,\sum _{{j\ge N}}\,\theta ^{2}_{{j}} \, +(1+\widetilde{\varepsilon }^{-1})\,r\,p^{-2}\).
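The shape of the right-hand side in Lemma 7 is what the elementary inequality \((a+b)^2\le (1+\widetilde{\varepsilon })a^2+(1+\widetilde{\varepsilon }^{-1})b^2\) would produce when a discrete coefficient is split into its continuous counterpart plus a discretization error. The following Python sketch checks the inequality numerically for one test function. Everything not stated above is an assumption made for illustration: the trigonometric basis `phi`, the uniform grid \(t_l=l/p\) with odd p, the definition of \(\theta _{j,p}\) as a discrete inner product, the reading of the class as a Sobolev ball \(\Vert S\Vert ^2+\Vert \dot{S}\Vert ^2\le r\), and the test function \(S(t)=t^2(1-t)^2\). The definitions in the paper may differ.

```python
import math

SQRT2 = math.sqrt(2.0)

def phi(j, x):
    """Assumed trigonometric basis on [0, 1]: phi_1 = 1, then cos/sin pairs."""
    if j == 1:
        return 1.0
    k = j // 2
    if j % 2 == 0:
        return SQRT2 * math.cos(2.0 * math.pi * k * x)
    return SQRT2 * math.sin(2.0 * math.pi * k * x)

def S(t):
    """Hypothetical test signal; S and its derivative vanish at 0 and 1."""
    return (t * (1.0 - t)) ** 2

def discrete_coeffs(p):
    """theta_{j,p}, j = 1..p, as (1/p) * sum_l S(l/p) * phi_j(l/p) (assumed)."""
    return [sum(S(l / p) * phi(j, l / p) for l in range(1, p + 1)) / p
            for j in range(1, p + 1)]

def continuous_coeffs(p, m=4000):
    """theta_j, j = 1..p, approximated by a midpoint rule with m nodes."""
    nodes = [(i + 0.5) / m for i in range(m)]
    return [sum(S(x) * phi(j, x) for x in nodes) / m for j in range(1, p + 1)]

def lemma7_gap(p, n_start, r, eps):
    """Left side minus right side of the inequality.

    The infinite tail sum over j >= N is truncated at j = p, which can only
    make the right side smaller, so a non-positive gap is a conservative check.
    """
    th_p = discrete_coeffs(p)
    th = continuous_coeffs(p)
    lhs = sum(v * v for v in th_p[n_start - 1:])
    tail = sum(v * v for v in th[n_start - 1:])
    return lhs - ((1.0 + eps) * tail + (1.0 + 1.0 / eps) * r / p ** 2)

# For S(t) = t^2 (1-t)^2: ||S||^2 = 1/630 and ||S'||^2 = 2/105, so r = 13/630.
R = 13.0 / 630.0

for n_start in (1, 5, 21):
    print(n_start, lemma7_gap(21, n_start, R, eps=0.5))
```

A non-positive printed gap for each N is consistent with the lemma under these assumptions; it is a sanity check for a single function, not a verification of the supremum over the whole class.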
Lemma 8
For any \(p\ge 2\) and \(r>0\), the coefficients \((\theta _{{j,p}})_{{1\le j\le p}}\) of functions S satisfy the inequality \( \max _{{1\le j\le p}} \,\sup _{{S\in \mathcal{W}_{{\mathbf{r},1}}}} \left( |\theta _{{j,p}} - \theta _{{j}}| -2\pi \sqrt{r} \,j\,p^{-1} \right) \, \le 0\).
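Lemma 8 says that a discrete Fourier coefficient differs from the continuous one by at most \(2\pi \sqrt{r}\,j/p\). A minimal Python sketch of a numerical check for a single function follows. The basis `phi`, the uniform grid \(t_l=l/p\), the discrete inner product used for \(\theta _{j,p}\), the interpretation of the class as the Sobolev ball \(\Vert S\Vert ^2+\Vert \dot{S}\Vert ^2\le r\), and the test function \(S(t)=t^2(1-t)^2\) are all assumptions introduced here, since the corresponding definitions are not reproduced in this appendix.

```python
import math

SQRT2 = math.sqrt(2.0)

def phi(j, x):
    """Assumed trigonometric basis on [0, 1]: phi_1 = 1, then cos/sin pairs."""
    if j == 1:
        return 1.0
    k = j // 2
    if j % 2 == 0:
        return SQRT2 * math.cos(2.0 * math.pi * k * x)
    return SQRT2 * math.sin(2.0 * math.pi * k * x)

def S(t):
    """Hypothetical test signal; S and its derivative vanish at 0 and 1."""
    return (t * (1.0 - t)) ** 2

def theta_discrete(j, p):
    """theta_{j,p} as the discrete inner product on the grid l/p (assumed)."""
    return sum(S(l / p) * phi(j, l / p) for l in range(1, p + 1)) / p

def theta_continuous(j, m=4000):
    """theta_j = integral of S * phi_j over [0, 1], by a midpoint rule."""
    return sum(S((i + 0.5) / m) * phi(j, (i + 0.5) / m) for i in range(m)) / m

def lemma8_slack(p, r):
    """max over j of |theta_{j,p} - theta_j| - 2*pi*sqrt(r)*j/p."""
    return max(
        abs(theta_discrete(j, p) - theta_continuous(j))
        - 2.0 * math.pi * math.sqrt(r) * j / p
        for j in range(1, p + 1)
    )

# For S(t) = t^2 (1-t)^2: ||S||^2 = 1/630 and ||S'||^2 = 2/105, so r = 13/630.
R = 13.0 / 630.0

print("slack for p = 51:", lemma8_slack(51, R))
```

A non-positive slack means the bound holds for this particular S and p. For a smooth signal such as this one the actual discretization error is far smaller than the bound, which is stated uniformly over the class.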
Lemma 9
For any \(p\ge 2\) and \(r>0\) the correction coefficients from (29) satisfy the inequality \( \sup _{{S\in \mathcal{W}_{{\mathbf{r},2}}}} \sum ^{p}_{{j=1}} h^{2}_{{j,p}} \le \,3r\,p^{-2}\).
About this article
Cite this article
Barbu, V.S., Beltaief, S. & Pergamenchtchikov, S. Adaptive efficient estimation for generalized semi-Markov big data models. Ann Inst Stat Math 74, 925–955 (2022). https://doi.org/10.1007/s10463-022-00820-y
DOI: https://doi.org/10.1007/s10463-022-00820-y