
On the properties of hermite series based distribution function estimators


Abstract

Hermite series based distribution function estimators have recently been applied in the context of sequential quantile estimation. These distribution function estimators are particularly useful because they allow the online (sequential) estimation of the full cumulative distribution function. This is in contrast to the empirical distribution function estimator and smooth kernel distribution function estimator which only allow sequential cumulative probability estimation at particular values on the support of the associated density function. Hermite series based distribution function estimators are well suited to the settings of streaming data, one-pass analysis of massive data sets and decentralised estimation. In this article we study these estimators in a more general context, thereby redressing a gap in the literature. In particular, we derive new asymptotic consistency results in the mean squared error, mean integrated squared error and almost sure sense. We also present novel Bias-robustness results for these estimators. Finally, we study the finite sample performance of the Hermite series based estimators through a real data example and simulation study. Our results indicate that in the general (non-sequential) context, the Hermite series based distribution function estimators are inferior to smooth kernel distribution function estimators, but may remain compelling in the context of sequential estimation of the full distribution function.
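As an illustration of the sequential setting described above, the following is a minimal Python sketch of an online Hermite series distribution function estimator of the form used in this article (coefficients \(\hat{a}_k\) maintained as running means of \(h_k(X_i)\), and \(\hat{F}_N(x) = \sum_{k=0}^{N} \hat{a}_k \int_{-\infty}^{x} h_k(t) dt\)). It is not the authors' implementation; the names hermite_functions and HermiteCDF, the choice N = 20 and the use of numerical quadrature for the integrals are illustrative assumptions.

# Minimal sketch (not the authors' implementation) of an online Hermite series
# CDF estimator: hat{F}_N(x) = sum_{k=0}^N hat{a}_k * int_{-inf}^x h_k(t) dt,
# with hat{a}_k = (1/n) sum_i h_k(X_i) maintained as a running mean.
import numpy as np
from scipy.integrate import quad

def hermite_functions(x, N):
    """Orthonormal Hermite functions h_0(x), ..., h_N(x) via the stable recurrence."""
    h = np.zeros(N + 1)
    h[0] = np.pi ** (-0.25) * np.exp(-0.5 * x * x)
    if N >= 1:
        h[1] = np.sqrt(2.0) * x * h[0]
    for k in range(2, N + 1):
        h[k] = np.sqrt(2.0 / k) * x * h[k - 1] - np.sqrt((k - 1) / k) * h[k - 2]
    return h

class HermiteCDF:
    def __init__(self, N):
        self.N, self.n = N, 0
        self.a = np.zeros(N + 1)               # running estimates of the coefficients a_k

    def update(self, x):                       # one-pass / streaming update with a new observation
        self.n += 1
        self.a += (hermite_functions(x, self.N) - self.a) / self.n

    def cdf(self, x):                          # the full CDF estimate is available at any time
        integrals = np.array([quad(lambda t, k=k: hermite_functions(t, self.N)[k],
                                   -np.inf, x)[0] for k in range(self.N + 1)])
        # clamp to [0, 1] for convenience; the raw series estimate need not lie in [0, 1]
        return float(np.clip(self.a @ integrals, 0.0, 1.0))

est = HermiteCDF(N=20)
for obs in np.random.default_rng(0).normal(size=5000):
    est.update(obs)
print(est.cdf(0.0))                            # roughly 0.5 for standard normal data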


Notes

  1. Download: https://mikejareds.github.io/FXData/


Acknowledgements

We would like to sincerely thank the reviewers for their insightful and useful comments which helped us materially improve this article.

Author information

Corresponding author

Correspondence to Michael Stephanou.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

A Lemmas

The first lemma is due to Greblicki and Pawlak (1985) (Lemma 1 in that paper) and is restated without proof as Lemma 1 below:

Lemma 1

(Greblicki and Pawlak (1985))

$$\begin{aligned} \lim _{N \rightarrow \infty }\sum _{k=0}^{N} a_{k}h_{k}(x) = f(x) \end{aligned}$$

at every differentiability point of f. If \(f \in L_{p}\), \(p>1\), the convergence holds for almost all \(x \in \mathbb {R}\).

The second lemma is due to Liebscher (1990) (Lemma 5 in that paper) and is presented without proof as Lemma 2 below:

Lemma 2

(Liebscher (1990)) For the Hermite series estimators (3):

$$\begin{aligned} \sum _{k=0}^{N} E\left( \hat{a}_{k}-a_{k}\right) ^{2}=O\left( \frac{N^{1/2}}{n}\right) . \end{aligned}$$

The third lemma follows from equation (15) in Theorem 4 of Greblicki and Pawlak (1984); we restate it without proof as Lemma 3 below:

Lemma 3

(Greblicki and Pawlak 1984) For the Hermite series estimators (3), if \(E|X|^{s} < \infty \), \(s> 8(r+1)/3(2r+1)\) then:

$$\begin{aligned} \sum _{k=0}^{N} \left( \hat{a}_{k}-a_{k}\right) ^{2} = O(n^{-2r/(2r+1)}\log n) \, \text{ a.s. } \end{aligned}$$

Finally, we present an important novel result with proof in Lemma 4 below. We will make use of Lemma 4 several times in this article.

Lemma 4

$$\begin{aligned} \int _{-\infty }^{x}|h_{k}(t)| dt \le 2 c_{1} (k+1)^{-\frac{1}{4}} + 12d_{1} (k+1)^{\frac{1}{2}}, \end{aligned}$$

where \(c_{1}\) and \(d_{1}\) are positive constants.

Proof

$$\begin{aligned} \int _{-\infty }^{x}|h_{k}(t)| dt\le & {} \int _{-\infty }^{\infty }|h_{k}(t)| dt\\= & {} \int _{-\infty }^{-1}|h_{k}(t)| dt +\int _{-1}^{1}|h_{k}(t)| dt +\int _{1}^{\infty }|h_{k}(t)| dt \\= & {} \int _{-1}^{1}|h_{k}(t)| dt +2\int _{1}^{\infty }|h_{k}(t)| dt \\\le & {} 2c_{1} (k+1)^{-\frac{1}{4}} + \frac{2d_{1}}{b} (k+1)^{\frac{5}{12}+\frac{b}{2}}, \, b>0. \end{aligned}$$

This follows from the inequalities implied by Theorem 8.91.3 of Szego (1975), namely, \(\max _{|x|\le a} |h_{k}(x)| \le c_{a} (k+1)^{-\frac{1}{4}}\) and \(\max _{|x|\ge a}|h_{k}(x)||x|^{\lambda } \le d_{a} (k+1)^{s} \), where \(c_{a}\) and \(d_{a}\) are positive constants depending only on a, \(s=\max (\frac{\lambda }{2} - \frac{1}{12}, -\frac{1}{4})\), and we have set \(\lambda =1+b, \, b>0\). In addition, we have made use of \(\int _{1}^{\infty } x^{-1-b} \, dx = \frac{1}{b}, \, b>0\). For concreteness we have set \(b=\frac{1}{6}\), which yields \(\frac{2d_{1}}{b} (k+1)^{\frac{5}{12}+\frac{b}{2}} = 12 d_{1} (k+1)^{\frac{1}{2}}\), the second term in the statement of the lemma. \(\square \)

B Proofs of propositions and theorems

B.1 Proof of Proposition 1

Proof

$$\begin{aligned} \left| E[\hat{F}_{N}(x)] - F(x) \right|= & {} \left| E\left[ \int _{-\infty }^{x} \hat{f}_{N}(t) dt\right] - \int _{-\infty }^{x} f(t) dt \right| \\\le & {} \int _{-\infty }^{x}\sum _{k=N+1}^{\infty } |a_k| |h_{k}(t)| dt. \end{aligned}$$

This follows from (3), (4), the fact that \(E(\hat{a}_k)=a_k\) and Lemma 1. By the monotone convergence theorem we have,

$$\begin{aligned} \int _{-\infty }^{x}\sum _{k=N+1}^{\infty } |a_k| |h_{k}(t)| dt = \sum _{k=N+1}^{\infty } |a_k| \int _{-\infty }^{x}|h_{k}(t)| dt. \end{aligned}$$

Utilising Lemma 4 we have:

$$\begin{aligned}&\sum _{k=N+1}^{\infty } |a_k| \int _{-\infty }^{x}|h_{k}(t)| dt \\&\quad \le 2c_{1}\sum _{k=N+1}^{\infty } |a_k| (k+1)^{-\frac{1}{4}} + 12d_{1} \sum _{k=N+1}^{\infty } |a_k| (k+1)^{\frac{1}{2}} \\&\quad \le 2c_{1}\sum _{k=N+1}^{\infty } |b_{k+r}| (k+1)^{-\frac{1}{4}-\frac{r}{2}} + 12d_{1} \sum _{k=N+1}^{\infty } |b_{k+r}| (k+1)^{\frac{1}{2} - \frac{r}{2}} \\&\quad \le 2c_{1}||\left( x-\frac{d}{dx}\right) ^r f(x)|| \sqrt{\sum _{k=N+1}^{\infty } (k+1)^{-\frac{1}{2}-r}} \\&\qquad + 12d_{1} ||\left( x-\frac{d}{dx}\right) ^r f(x)|| \sqrt{\sum _{k=N+1}^{\infty } (k+1)^{1- r}}, \end{aligned}$$

where we have also used the fact that, by assumption, \((x-\frac{d}{dx})^r f(x) \in L_{2}\) and Walter (1977) has shown \(a_{k}^{2} \le \frac{b_{k+r}^{2}}{(k+1)^r}\), where \(b_k\) is the k-th coefficient of the expansion of \((x-\frac{d}{dx})^r f(x) \in L_{2}\). In addition, we have utilised Parseval’s theorem, \(||(x-\frac{d}{dx})^r f(x)||^{2} = \sum _{k=0}^{\infty } b_{k}^{2}\), and the Cauchy–Schwarz inequality in the last line. Using the well-known asymptotics of the Hurwitz zeta function, \(\zeta (s,a) = \sum _{k=0}^{\infty }(k+a)^{-s}\) (see DLMF 2017, 25.11.43), namely \(\sum _{k=N+1}^{\infty } (k+1)^{-\frac{1}{2}-r} = \zeta (r+\tfrac{1}{2},N+2) = O(N^{\frac{1}{2}-r})\) and \(\sum _{k=N+1}^{\infty } (k+1)^{1-r} = \zeta (r-1,N+2) = O(N^{2-r})\), so that the second (dominant) term determines the rate, we have:

$$\begin{aligned} \sum _{k=N+1}^{\infty } |a_k| \int _{-\infty }^{x}|h_{k}(t)| dt = O(N^{-r/2 +1}), \end{aligned}$$
(7)

completing the proof. \(\square \)

B.2 Proof of Proposition 2

Proof

It is easy to see that

$$\begin{aligned} \left| \hat{F}_{N}(x) - E[\hat{F}_{N}(x)] \right|= & {} \left| \sum _{k=0}^{N} (\hat{a}_{k}-a_{k}) \int _{-\infty }^{x}h_k(t)dt \right| \\\le & {} \sqrt{\sum _{k=0}^{N} \left( \hat{a}_{k}-a_{k}\right) ^{2}} \sqrt{\sum _{k=0}^{N}\left| \int _{-\infty }^{x}h_k(t)dt\right| ^2}. \end{aligned}$$

Now by virtue of Lemma 4 we have:

$$\begin{aligned} \left[ \hat{F}_{N}(x) - E[\hat{F}_{N}(x)] \right] ^{2} = \sum _{k=0}^{N} \left( \hat{a}_{k}-a_{k}\right) ^{2} O\left( N^{2}\right) . \end{aligned}$$
(8)
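To spell out the \(O(N^{2})\) factor in (8), note that by Lemma 4,

$$\begin{aligned} \sum _{k=0}^{N}\left| \int _{-\infty }^{x}h_k(t)dt\right| ^2 \le \sum _{k=0}^{N}\left( 2 c_{1} (k+1)^{-\frac{1}{4}} + 12d_{1} (k+1)^{\frac{1}{2}}\right) ^{2} = O\left( N^{2}\right) . \end{aligned}$$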

Making use of Lemma 2 we have,

$$\begin{aligned} E\left[ \hat{F}_{N}(x) - E[\hat{F}_{N}(x)] \right] ^{2} = O\left( \frac{N^{\frac{5}{2}}}{n}\right) . \end{aligned}$$
(9)

\(\square \)

B.3 Proof of Theorem 3

Proof

We begin by restating the definition of the rate of almost sure convergence provided in Greblicki and Pawlak (1984): for a sequence of random variables \(Y_{n}\), we say that \(Y_{n}=O(a_{n})\) almost surely if \(\frac{\beta _{n} Y_{n}}{a_n} \rightarrow 0\) almost surely as \(n \rightarrow \infty \), for all (non-negative) sequences \(\{\beta _{n}\}\) convergent to zero. Now,

$$\begin{aligned} \left| \hat{F}_{N}(x) - F(x) \right|\le & {} \left| E[\hat{F}_{N}(x)] - F(x) \right| + \left| \hat{F}_{N}(x) - E[\hat{F}_{N}(x)] \right| . \end{aligned}$$

By Proposition 1,

$$\begin{aligned}&\left| E[\hat{F}_{N}(x)] - F(x) \right| = O(N^{-r/2 +1}) \\&\quad =O(n^{-(r-2)/(2r+1)}). \end{aligned}$$
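Here we have assumed that the truncation parameter grows as \(N = O(n^{2/(2r+1)})\), which is consistent with the rate in Lemma 3 and converts the N-rate of Proposition 1 into the stated n-rate:

$$\begin{aligned} N^{-r/2 +1} = O\left( n^{\frac{2}{2r+1}\left( 1-\frac{r}{2}\right) }\right) = O\left( n^{-(r-2)/(2r+1)}\right) . \end{aligned}$$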

In addition, via (8) we have:

$$\begin{aligned} \left| \hat{F}_{N}(x) - E[\hat{F}_{N}(x)] \right| =\sqrt{\sum _{k=0}^{N} \left( \hat{a}_{k}-a_{k}\right) ^{2}} O\left( N\right) . \end{aligned}$$

We make use of Lemma 3 to obtain,

$$\begin{aligned} \left| \hat{F}_{N}(x) - E[\hat{F}_{N}(x)] \right| = O(n^{-(r-2)/(2r+1)} \log n) \, a.s., \end{aligned}$$

and finally:

$$\begin{aligned} \left| \hat{F}_{N}(x) - F(x) \right| =O(n^{-(r-2)/(2r+1)} \log n) \, a.s. \end{aligned}$$

\(\square \)

B.4 Proof of Theorem 4

Proof

It suffices to prove \(\sum _{n=1}^{\infty } P\left( |\hat{F}_{N(n)} (x)- F(x)| > \epsilon \right) < \infty \) for all \(\epsilon > 0\) (Borel-Cantelli). We have via the law of total probability,

$$\begin{aligned}&\sum _{n=1}^{\infty } P\left( |\hat{F}_{N(n)} (x)- F(x)|> \epsilon \right) \\&\quad = \sum _{n=1}^{\infty } P\left( |\hat{F}_{N(n)} (x)- F(x)|> \epsilon \big | N(n)> cn^{\gamma }\right) P\left( N(n)> cn^{\gamma } \right) \\&\qquad + \sum _{n=1}^{\infty } P\left( |\hat{F}_{N(n)} (x)- F(x)| > \epsilon \big | N(n) \le cn^{\gamma } \right) P\left( N(n) \le cn^{\gamma } \right) ,\, \end{aligned}$$

where c is a constant. By the assumption that \(\sum _{n=1}^{\infty } P\left( \frac{N(n)}{n^{\gamma }} > \epsilon \right) < \infty \) for all \(\epsilon > 0\) , it is clear that the first term is finite. It remains to show that \(\sum _{n=1}^{\infty } P\left( |\hat{F}_{N(n)} (x)- F(x)| > \epsilon \big | N(n) \le cn^{\gamma } \right) < \infty \) for all \(\epsilon > 0\). By the conditional Markov inequality we have:

$$\begin{aligned}&P\left( |\hat{F}_{N(n)} (x)- F(x)| > \epsilon | N(n) = q(n) \right) \\&\quad \le \epsilon ^{-p} E \left| \int _{-\infty }^{x} \sum _{k=0}^{q(n)} (\hat{a}_k-a_k) h_{k}(t) dt - \int _{-\infty }^{x} \sum _{k=q(n)+1}^{\infty } a_k h_k(t) dt \right| ^p, \end{aligned}$$

for all \(\epsilon > 0\). Using the fact that \(|f+g|^{p} \le 2^{p-1} (|f|^{p} +|g|^{p})\) along with the Hölder inequality, Lemma 4 and Proposition 1 we have,

$$\begin{aligned}&P\left( |\hat{F}_{N(n)} (x)- F(x)| > \epsilon | N(n) = q(n) \right) \\&\quad \le \epsilon ^{-p} 2^{p-1} \left( \sum _{k=0}^{q(n)} E|\hat{a}_k-a_k |^{p}\right) \left( \sum _{k=0}^{q(n)} \left( \int _{-\infty }^{x}|h_{k}(t)| dt\right) ^{p/(p-1)}\right) ^{p-1}\\&\qquad + \epsilon ^{-p} 2^{p-1} \left| \int _{-\infty }^{x} \sum _{k=q(n)+1}^{\infty } a_k h_k(t) dt \right| ^p \\&\quad \le \epsilon ^{-p} 2^{p-1} b_{1} \left( \sum _{k=0}^{q(n)} E|\hat{a_k}-a_k |^{p}\right) \left( \sum _{k=0}^{q(n)} (k+1)^{p/2(p-1)}\right) ^{p-1} \\&\qquad + \epsilon ^{-p} 2^{p-1} b_{2} q(n)^{-rp/2 + p}, \end{aligned}$$

for all \(\epsilon > 0\) , where \(b_{1}, b_{2}\) are positive constants. Now, the results of Dharmadhikari and Jogdeo (1969) for independent random variables, \(X_{i}\), with zero mean imply that (Theorem 2 in that paper):

$$\begin{aligned} E\left| \sum _{i=1}^{n} X_{i} \right| ^{\nu } \le F_{\nu } n^{\nu /2 - 1}\sum _{i=1}^{n} E\left| X_{i}\right| ^{\nu }, \end{aligned}$$

where \(\nu \ge 2\) and \(F_{\nu }\) is a constant depending only on \(\nu \). Thus we have \(E|\hat{a_k}-a_k |^{p} = n^{-p} E|\sum _{i=1}^{n} (h_k(\mathbf {x_i}) - a_k)|^{p} \le F_{p} n^{-p/2-1}\sum _{i=1}^{n} E|h_k(\mathbf {x_i}) - a_k|^{p} \), where \(F_{p}\) is a constant depending only on p. Also noting that \(\max _{x} |h_{k}(x)| \le C (k+1)^{-1/12}\) where C is a positive constant (implied by Theorem 8.91.3 of Szego 1975), we have:

$$\begin{aligned}&P\left( |\hat{F}_{N(n)} (x)- F(x)| > \epsilon | N(n) = q(n) \right) \\&\quad \le \epsilon ^{-p} 2^{p-1} b_{3} n^{-p/2} q(n)^{-p/12+1} q(n)^{3p/2 -1}+ \epsilon ^{-p} 2^{p-1} b_{2} q(n)^{-rp/2 + p}, \end{aligned}$$

for all \(\epsilon > 0\), where \(b_{3}\) depends only on p. It is easy to see that for \(r>2\) and \(q(n) = O(n^{\gamma })\), \(0< \gamma < 6/17\), we can choose p such that \(\sum _{n=1}^{\infty } P\left( |\hat{F}_{N(n)} (x)- F(x)| > \epsilon | N(n) = q(n) \right) < \infty \) for all \(\epsilon > 0\). \(\square \)
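For the reader's convenience, the exponent arithmetic behind the condition \(\gamma < 6/17\) is as follows (a brief check based on the bound above). The first term has total power \(q(n)^{-p/12+1+3p/2-1}=q(n)^{17p/12}\), so for \(q(n) = O(n^{\gamma })\) it is \(O\left( n^{-p/2+\frac{17\gamma p}{12}}\right) \), which is summable in n whenever \(p\left( \frac{1}{2}-\frac{17\gamma }{12}\right) >1\); such a p exists precisely when \(\gamma <6/17\). The second term is \(O\left( q(n)^{-p(r-2)/2}\right) \) and, for \(r>2\) and \(q(n)\) growing polynomially in n, is likewise summable for p large enough.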

B.5 Proof of Proposition 3

Proof

The fixed N Hermite series estimator (4) (equal to (5)) can be represented as:

$$\begin{aligned} T(x,\hat{F}_{n}) =\int _{-\infty }^{\infty } \left[ \int _{-\infty }^{x} d_{N}(t,y) dy \right] d\hat{F}_{n}(t), \end{aligned}$$

where \(d_{N}(t,y) = \sum _{k=0}^{N} h_{k}(t) h_{k}(y)\). The influence function and empirical influence function are:

$$\begin{aligned} IF(x,x';T,F)= & {} \int _{-\infty }^{x} d_{N}(x',y) dy - \int _{-\infty }^{\infty }\int _{-\infty }^{x} d_{N}(t,y) dy dF(t),\\ IF(x,x';T,\hat{F}_{n})= & {} \int _{-\infty }^{x} d_{N}(x',y) dy -\int _{-\infty }^{\infty } \int _{-\infty }^{x} d_{N}(t,y) dy d\hat{F}_{n}(t). \end{aligned}$$

Now, for fixed N,

$$\begin{aligned} |\int _{-\infty }^{x} d_{N}(t,y) dy|\le & {} \sum _{k=0}^{N} |h_{k}(t)| \int _{-\infty }^{x} |h_{k}(y)| dy \nonumber \\\le & {} u_1 \sum _{k=0}^{N} (k+1)^{-1/12-1/4} + v_{1} \sum _{k=0}^{N} (k+1)^{1/2-1/12} \nonumber \\< & {} \infty , \end{aligned}$$
(10)

where \(u_1\) and \(v_1\) are constants. The result (10) follows from Lemma 4 and the fact that \(\max _{t} |h_{k}(t)| \le C (k+1)^{-1/12}\). Thus the gross-error sensitivities satisfy \(\sup _{x'} |IF(x,x';T,F)|< \infty \) and \(\sup _{x'} |IF(x,x';T,\hat{F}_{n})| < \infty \), and the fixed N Hermite series cumulative distribution function estimator is Bias-robust. \(\square \)

B.6 Proof of Proposition 4

Proof

The kernel distribution function estimator is defined as:

$$\begin{aligned} \hat{F}(x) = \frac{1}{n} \sum _{i=1}^{n} \int _{-\infty }^{x} \frac{1}{h} K\left( \frac{\mathbf {x}_{i}-y}{h}\right) dy. \end{aligned}$$

This has the representation:

$$\begin{aligned} T(x,\hat{F}_{n}) = \int _{-\infty }^{\infty } \left[ \int _{-\infty }^{x} \frac{1}{h} K\left( \frac{t-y}{h}\right) dy\right] d\hat{F}_{n}(t), \end{aligned}$$

where \(\hat{F}_{n}\) is the empirical distribution function. The influence function and empirical influence function are easily seen to be:

$$\begin{aligned} IF(x,x';T,F)= & {} \left[ \int _{-\infty }^{x} \frac{1}{h} K\left( \frac{x'-y}{h}\right) dy\right] \\&- \int _{-\infty }^{\infty } \left[ \int _{-\infty }^{x} \frac{1}{h} K\left( \frac{t-y}{h}\right) dy\right] dF(t),\\ IF(x,x';T,\hat{F}_{n})= & {} \left[ \int _{-\infty }^{x} \frac{1}{h} K\left( \frac{x'-y}{h}\right) dy\right] \\&- \int _{-\infty }^{\infty } \left[ \int _{-\infty }^{x} \frac{1}{h} K\left( \frac{t-y}{h}\right) dy\right] d\hat{F}_{n}(t). \end{aligned}$$

Since \(\int _{-\infty }^{\infty } K(u) du = 1\) and K is non-negative, each of the inner integrals above lies in [0, 1]; hence \(\sup _{x'} |IF(x,x';T,F)| \le 2 < \infty \), \(\sup _{x'}|IF(x,x';T,\hat{F}_{n})| \le 2 < \infty \), and thus the smooth kernel distribution function estimator is Bias-robust. \(\square \)
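As a concrete illustration (not taken from the paper), with a Gaussian kernel the inner integral has the closed form \(\int _{-\infty }^{x} \frac{1}{h} K\left( \frac{t-y}{h}\right) dy = \Phi \left( \frac{x-t}{h}\right) \), so the estimator reduces to an average of normal CDF terms. A minimal Python sketch, where the function name kernel_cdf and the bandwidth h = 0.3 are illustrative choices:

# Illustrative sketch: smooth kernel distribution function estimator with a
# Gaussian kernel, for which the inner integral is Phi((x - X_i)/h).
import numpy as np
from scipy.stats import norm

def kernel_cdf(x, data, h):
    """hat{F}(x) = (1/n) * sum_i Phi((x - X_i)/h)."""
    return float(np.mean(norm.cdf((x - np.asarray(data)) / h)))

rng = np.random.default_rng(1)
sample = rng.normal(size=1000)
print(kernel_cdf(0.0, sample, h=0.3))   # close to 0.5 for standard normal data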

B.7 Proof of Proposition 5

Proof

Suppose a density function, f(x), can be expanded formally as:

$$\begin{aligned} f(x)= & {} \sum _{k=0}^{\infty } c_{k} He_{k}(x) \phi (x),\\ c_{k}= & {} \frac{1}{k!} \int _{-\infty }^{\infty } f(x) He_{k}(x) dx, \end{aligned}$$

where \(\phi (x)=\frac{e^{-x^2/2}}{\sqrt{2\pi }}\) and \(He_{k}(x)\) are the Chebyshev-Hermite polynomials (following the notation of Szego (1975)). The truncated expansion has the form:

$$\begin{aligned} f(x)=\sum _{k=0}^{N} c_{k} He_{k} (x) \phi (x), \end{aligned}$$

In practice the expansion is usually truncated to obtain:

$$\begin{aligned} f(x)=\phi (x) (1+\frac{1}{2} (\mu _{2}-1)He_{2}(x) + \frac{1}{6} \mu _{3} He_{3}(x) +\frac{1}{24} (\mu _{4}-6\mu _{2}+3)He_{4}(x)), \end{aligned}$$

where \(\mu _{2},\mu _{3},\mu _{4}\) are non-central moments. This is the Gram–Charlier series of Type A (Kendall et al. 1987). A natural cumulative distribution function estimator based on the Gram–Charlier series is:

$$\begin{aligned} \hat{F}_{N}(x) = \sum _{k=0}^{N} \hat{c}_{k} \int _{-\infty }^{x} He_{k}(y) \phi (y) dy. \end{aligned}$$
(11)

This has the representation

$$\begin{aligned} T(x,\hat{F}_{n}) = \int _{-\infty }^{\infty } \left[ \int _{-\infty }^{x} \sum _{k=0}^{N}\frac{1}{k!} He_{k}(t)He_{k}(y) \phi (y) dy \right] d\hat{F}_{n}(t). \end{aligned}$$

Now:

$$\begin{aligned} IF(x,x';T,F)= & {} \sum _{k=0}^{N} \frac{1}{k!} He_{k}(x')\int _{-\infty }^{x} He_{k}(y) \phi (y) dy \\&- \int _{-\infty }^{\infty } \int _{-\infty }^{x}\sum _{k=0}^{N}\frac{1}{k!} He_{k}(t)He_{k}(y) \phi (y) dy dF(t),\\ IF(x,x';T,\hat{F}_{n})= & {} \sum _{k=0}^{N} \frac{1}{k!} He_{k}(x')\int _{-\infty }^{x} He_{k}(y) \phi (y) dy \\&- \int _{-\infty }^{\infty }\int _{-\infty }^{x}\sum _{k=0}^{N}\frac{1}{k!} He_{k}(t)He_{k}(y) \phi (y) dy d\hat{F}_{n}(t). \end{aligned}$$

Since \(He_{k}(x')\) is unbounded in \(x'\), whereas the second terms of \(IF(x,x';T,F)\) and \(IF(x,x';T,\hat{F}_{n})\) do not depend on \(x'\), the gross-error sensitivities \(\sup _{x'} |IF(x,x';T,F)|\) and \(\sup _{x'} |IF(x,x';T,\hat{F}_{n})|\) are not bounded and thus the CDF estimator (11) is not Bias-robust. \(\square \)
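As an illustrative numerical check (not part of the paper), the \(x'\)-dependent term of the influence functions above can be evaluated using the closed form \(\int _{-\infty }^{x} He_{k}(y)\phi (y) dy = \Phi (x)\) for \(k=0\) and \(-He_{k-1}(x)\phi (x)\) for \(k\ge 1\). In the Python sketch below, the function names and the choice N = 4 are assumptions; the output grows without bound as \(x'\) increases, consistent with Proposition 5:

# Numerical illustration that the x'-dependent term of the Gram-Charlier influence
# function, sum_{k=0}^N (1/k!) He_k(x') * int_{-inf}^x He_k(y) phi(y) dy, is unbounded in x'.
import math
from scipy.stats import norm

def He(k, x):
    """Probabilists' (Chebyshev-Hermite) polynomial He_k(x) via the three-term recurrence."""
    if k == 0:
        return 1.0
    p_prev, p = 1.0, x                            # He_0, He_1
    for j in range(2, k + 1):
        p_prev, p = p, x * p - (j - 1) * p_prev
    return p

def leading_if_term(x, x_prime, N=4):
    total = norm.cdf(x)                           # k = 0 term
    for k in range(1, N + 1):
        integral = -He(k - 1, x) * norm.pdf(x)    # closed-form integral for k >= 1
        total += He(k, x_prime) * integral / math.factorial(k)
    return total

for xp in [1.0, 5.0, 10.0, 50.0]:
    print(xp, leading_if_term(x=0.0, x_prime=xp))   # grows without bound in x'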


Cite this article

Stephanou, M., Varughese, M. On the properties of hermite series based distribution function estimators. Metrika 84, 535–559 (2021). https://doi.org/10.1007/s00184-020-00785-z
