Abstract
In this paper, we introduce a new smooth estimator for continuous distribution functions on the positive real half-line using Szasz–Mirakyan operators, in a construction analogous to Bernstein’s approximation theorem. We show that the proposed estimator outperforms the empirical distribution function in terms of asymptotic (integrated) mean-squared error and generally compares favorably with other competitors in theoretical comparisons. We also conduct simulations to demonstrate the finite-sample performance of the proposed estimator.
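To make the construction concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the estimator is obtained by applying the classical Szasz–Mirakyan operator, \(S_m(f;x)=e^{-mx}\sum_{k\ge 0} f(k/m)\frac{(mx)^k}{k!}\), to the empirical distribution function \(F_n\). The function names `poisson_weights` and `szasz_edf` and the truncation rule for the infinite sum are illustrative choices, not taken from the paper.

```python
import bisect
import math


def poisson_weights(mu, k_max):
    """Return [V_0, ..., V_k_max] with V_k = exp(-mu) * mu**k / k!."""
    if mu == 0.0:
        return [1.0] + [0.0] * k_max  # all mass sits at k = 0
    log_mu = math.log(mu)
    # Work in log space so that large mu and k do not overflow.
    return [math.exp(-mu + k * log_mu - math.lgamma(k + 1)) for k in range(k_max + 1)]


def szasz_edf(sample, m, x):
    """Szasz-Mirakyan smoothing of the empirical distribution function at x >= 0.

    Assumed form: sum over k of F_n(k/m) * V_{k,m}(x), with the infinite sum
    truncated at a hypothetical cut-off far in the Poisson tail.
    """
    data = sorted(sample)  # re-sorted on every call for simplicity
    n = len(data)
    mu = m * x
    k_max = int(mu + 12.0 * math.sqrt(mu) + 30.0)  # illustrative truncation point
    total = 0.0
    for k, weight in enumerate(poisson_weights(mu, k_max)):
        edf_at_knot = bisect.bisect_right(data, k / m) / n  # F_n(k/m)
        total += edf_at_knot * weight
    return total


if __name__ == "__main__":
    lifetimes = [0.5, 1.0, 1.5, 2.0]  # toy data, for illustration only
    for x in (0.0, 0.75, 1.25, 3.0):
        print(x, szasz_edf(lifetimes, m=50, x=x))
```

Because the weights \(V_{k,m}(x)\) are Poisson probabilities with mean \(mx\), the smoothed value is a weighted average of \(F_n\) on the grid \(k/m\). It therefore stays in \([0,1]\) and places no mass on the negative half-line.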
References
Altman, N., Léger, C. (1995). Bandwidth selection for Kernel distribution function estimation. Journal of Statistical Planning and Inference, 46(2), 195–214.
Babu, G. J., Canty, A. J., Chaubey, Y. P. (2002). Application of Bernstein polynomials for smooth estimation of a distribution and density function. Journal of Statistical Planning and Inference, 105(2), 377–392.
Bowman, A., Hall, P., Prvan, T. (1998). Bandwidth selection for the smoothing of distribution functions. Biometrika, 85(4), 799–808.
Duin, R. (1976). On the choice of smoothing parameters for Parzen estimators of probability density functions. IEEE Transactions on Computers, C-25(11), 1175–1179.
Falk, M. (1983). Relative efficiency and deficiency of Kernel type estimators of smooth distribution functions. Statistica Neerlandica, 37(2), 73–83.
Gramacki, A. (2018). Nonparametric Kernel density estimation and its computational aspects. Studies in Big Data. New York: Springer International Publishing.
Hanebeck, A. (2020). Nonparametric distribution function estimation. Master’s thesis, Karlsruher Institut für Technologie (KIT).
Hanebeck, A., Klar, B. (2020). Smooth distribution function estimation for lifetime distributions using Szasz-Mirakyan operators. arXiv:2005.09994 [math, stat].
Helali, S., Slaoui, Y. (2020). Estimation of a distribution function using Lagrange polynomials with Tchebychev-Gauss points. Statistics and Its Interface, 13(3), 399–410.
Hogg, R. V., Klugman, S. A. (1984). Loss distributions. New York: Wiley-Interscience.
Jmaei, A., Slaoui, Y., Dellagi, W. (2017). Recursive distribution estimator defined by stochastic approximation method using Bernstein polynomials. Journal of Nonparametric Statistics, 29, 792–805.
Johnson, N. L., Kotz, S., Balakrishnan, N. (1994). Continuous univariate distributions (2nd ed., Vol. 1). New York: Wiley-Interscience.
Johnson, N. L., Kotz, S., Balakrishnan, N. (1995). Continuous univariate distributions (2nd ed., Vol. 2). New York: Wiley-Interscience.
Kim, C., Kim, S., Park, M., Lee, H. (2006). A bias reducing technique in kernel distribution function estimation. Computational Statistics, 21(3), 589–601.
Leblanc, A. (2012). On estimating distribution functions using Bernstein polynomials. Annals of the Institute of Statistical Mathematics, 64(5), 919–943.
Lockhart, R. (2013). The basics of nonparametric models. http://people.stat.sfu.ca/~lockhart/richard/830/13_3/lectures/nonparametric_basics/.
Lorentz, G. G. (1986). Bernstein polynomials (2nd ed.). New York: Chelsea Publishing Co.
Marshall, A. W., Olkin, I. (2007). Life distributions: Structure of nonparametric, semiparametric, and parametric families. Springer Series in Statistics, New York: Springer.
Mokkadem, A., Pelletier, M., Slaoui, Y. (2009). The stochastic approximation method for the estimation of a multivariate probability density. arXiv:0807.2960 [math, stat].
Ouimet, F. (2020). A local limit theorem for the Poisson distribution and its application to the Le Cam distance between Poisson and Gaussian experiments and asymptotic properties of Szasz estimators. arXiv:2010.05146 [math, stat].
Parzen, E. (1962). On estimation of a probability density function and mode. The Annals of Mathematical Statistics, 33(3), 1065–1076.
Polansky, A. M., Baker, E. R. (2000). Multistage plug-in bandwidth selection for Kernel distribution function estimates. Journal of Statistical Computation and Simulation, 65(1–4), 63–80.
Rosenblatt, M. (1956). Remarks on some nonparametric estimates of a density function. The Annals of Mathematical Statistics, 27(3), 832–837.
Rudemo, M. (1982). Empirical choice of histograms and Kernel density estimators. Scandinavian Journal of Statistics, 9(2), 65–78.
Schwartz, S. C. (1967). Estimation of probability density by an orthogonal series. The Annals of Mathematical Statistics, 38(4), 1261–1265.
Slaoui, Y. (2014). Bandwidth selection for recursive Kernel density estimators defined by stochastic approximation method. https://www.hindawi.com/journals/jps/2014/739640/.
Stephanou, M., Varughese, M., Macdonald, I. (2017). Sequential quantiles via Hermite series density estimation. Electronic Journal of Statistics, 11(1), 570–607.
Szasz, O. (1950). Generalization of S. Bernstein’s polynomials to the infinite interval. Journal of Research of the National Bureau of Standards, 45, 239–245.
Tenreiro, C. (2006). Asymptotic behaviour of multistage plug-in bandwidth selections for Kernel distribution function estimators. Journal of Nonparametric Statistics, 18(1), 101–116.
Watson, G. S., Leadbetter, M. R. (1964). Hazard analysis II. Sankhyā: The Indian Journal of Statistics, Series A (1961–2002), 26(1), 101–116.
Yamato, H. (1973). Uniform convergence of an estimator of a distribution function. Bulletin of Mathematical Statistics, 15(3), 69–78.
Zhang, S., Li, Z., Zhang, Z. (2020). Estimating a distribution function at the boundary. Austrian Journal of Statistics, 49(1), 1–23.
Acknowledgements
The authors are grateful to two reviewers and the editors for their helpful remarks and comments on an earlier version of this manuscript. They are also sincerely grateful to Frédéric Ouimet for pointing out an error in a previous version of Lemma 3, for helpful discussions and for sharing his preprint Ouimet (2020).
Appendix
The following theorem can be found in Ouimet (2020), who pointed out a mistake in Leblanc (2012) that also affects this paper. Accordingly, the asymptotic behavior of \(R_{1,m}^S\) in Lemma 3 has been corrected relative to Lemma 3 in Hanebeck and Klar (2020), arXiv v1.
Theorem 8
We define
Pick any \(\eta \in (0,1)\). Then, we have uniformly for \(k \in {\mathbb {N}}_0\) with \(\left| \frac{\delta _k}{\sqrt{mx}}\right| \le \eta\) that
as \(n \rightarrow \infty\).
We now present various properties of \(V_{k,m}\) that are needed for the proofs. The following lemma and its proof are similar to Lemma 2 and Lemma 3 in Leblanc (2012). As mentioned before, parts (e) and (h) take the suggestions in Ouimet (2020) into account. The proofs for these parts are adjusted accordingly.
Lemma 3
Define
and
and \(V_{k,m}(x)=e^{-mx}\frac{(mx)^k}{k!}\). It trivially holds that \(0 \le L_m^S(x) \le 1\) for \(x\in [0,\infty )\). In addition, the following properties hold.
(a) \(L_m^S(0)=1\) and \(\displaystyle \lim _{x\rightarrow \infty } L_m^S(x)=0\),
(b) \(R_{j,m}^S(0)=0\) for \(j\in \{0,1,2\}\),
(c) \(0 \le R_{2,m}^S(x) \le \frac{x}{m}\) for \(x \in (0,\infty )\),
(d) \(L_m^S(x)=m^{-1/2}\left[ (4\pi x)^{-1/2}+o_x(1)\right]\) for \(x\in (0,\infty )\),
(e) \(\tilde{R}_{1,m}^S(x)= -\sqrt{\frac{x}{\pi }}+o_x(1)\) for \(x\in (0,\infty )\) and \(R_{1,m}^S(x)=m^{-1/2}\left[ -\sqrt{\frac{x}{4\pi }}+o_x(1)\right]\),
(f) \(m^{1/2} \displaystyle \int _0^{\infty } L_m^S(x)e^{-ax}{\mathrm {d}}x =\frac{1}{2\sqrt{a}}+o(1)\) for \(a \in (0,\infty )\),
(g) \(m^{1/2} \displaystyle \int _0^{\infty } x L_m^S(x)e^{-ax}{\mathrm {d}}x =\frac{1}{4a^{3/2}}+o(1)\) for \(a \in (0,\infty )\),
(h) For any continuous and bounded function g on \([0,\infty )\), \(m^{1/2} \displaystyle \int _0^{\infty } g(x)R_{1,m}^S(x)e^{-ax}{\mathrm {d}}x = -\displaystyle \int _0^{\infty } g(x)\frac{\sqrt{x}}{\sqrt{4\pi }}e^{-ax}{\mathrm {d}}x+o(1)\) for \(a \in (0,\infty )\) and \(\displaystyle \int _0^{\infty } g(x)\tilde{R}_{1,m}^S(x)e^{-ax}{\mathrm {d}}x = -\displaystyle \int _0^{\infty } g(x)\frac{\sqrt{x}}{\sqrt{\pi }}e^{-ax}{\mathrm {d}}x+o(1)\).
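Parts (a), (d) and (f) can be checked numerically. The sketch below is an illustration under a stated assumption, not part of the proof: the displayed definition of \(L_m^S\) is not reproduced in this text, so the code assumes \(L_m^S(x)=\sum_{k\ge 0} V_{k,m}^2(x)\), which is consistent with (a) and (d). The function names, the truncation rule for the sum and the quadrature settings are hypothetical choices.

```python
import math


def squared_weight_sum(m, x):
    """Assumed L_m^S(x): sum over k of V_{k,m}(x)**2, truncated far in the Poisson tail."""
    mu = m * x
    if mu == 0.0:
        return 1.0  # only V_0 = 1 contributes at x = 0
    k_max = int(mu + 12.0 * math.sqrt(mu) + 30.0)  # illustrative truncation point
    log_mu = math.log(mu)
    # Square the Poisson weights in log space to avoid overflow.
    return sum(
        math.exp(2.0 * (-mu + k * log_mu - math.lgamma(k + 1)))
        for k in range(k_max + 1)
    )


def scaled_laplace_integral(m, a, t_max=6.0, steps=600):
    """Trapezoidal value of sqrt(m) * int_0^inf L(x) exp(-a x) dx.

    The substitution x = t**2 removes the x**(-1/2) behaviour of the integrand;
    the integral is cut off at t_max, where exp(-a * t_max**2) is negligible.
    """
    h = t_max / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        value = 2.0 * t * squared_weight_sum(m, t * t) * math.exp(-a * t * t)
        total += value * (0.5 if i in (0, steps) else 1.0)
    return math.sqrt(m) * h * total


if __name__ == "__main__":
    # (a): value at the origin
    print(squared_weight_sum(400, 0.0))
    # (d): sqrt(m) * L_m^S(x) against (4 * pi * x) ** (-1/2) at x = 1
    print(math.sqrt(400) * squared_weight_sum(400, 1.0), (4.0 * math.pi) ** -0.5)
    # (f): scaled integral against 1 / (2 * sqrt(a)) at a = 1
    print(scaled_laplace_integral(50, 1.0), 0.5)
```

For moderate \(m\), the two numbers in each printed pair should be close, with the gap shrinking as \(m\) grows. Part (g) can be checked the same way by inserting an extra factor \(x=t^2\) into the integrand.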
About this article
Cite this article
Hanebeck, A., Klar, B. Smooth distribution function estimation for lifetime distributions using Szasz–Mirakyan operators. Ann Inst Stat Math 73, 1229–1247 (2021). https://doi.org/10.1007/s10463-020-00783-y
DOI: https://doi.org/10.1007/s10463-020-00783-y