
New type of gamma kernel density estimator

  • Research Article
Journal of the Korean Statistical Society

Abstract

We discuss a new kernel-type estimator for a density function \(f_X(x)\) with nonnegative support. We use a type of gamma density as the kernel function and modify it with expansions of the exponential and logarithmic functions. The modified gamma kernel density estimator is not only free of boundary bias, but its variance is also of smaller order: \(O(n^{-1}h^{-1/4})\) in the interior and \(O(n^{-1}h^{-3/4})\) in the boundary region. Furthermore, the optimal orders of its mean squared error are \(O(n^{-8/9})\) in the interior and \(O(n^{-8/11})\) in the boundary region. Simulation results demonstrating the proposed method's performance are also presented.



Acknowledgements

The authors would like to thank the two anonymous referees and the editor-in-chief for their careful reading and valuable comments, which improved the manuscript.

Funding

This work was supported by JSPS Grant-in-Aid for Scientific Research (B) [Grant number 16H02790].

Author information


Corresponding author

Correspondence to Yoshihiko Maesono.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Proof of Theorem 2.2

First, by the usual argument for i.i.d. random variables, we have

$$\begin{aligned} E[A_h(x)]=\int _{0}^\infty \frac{w^{\frac{1}{\sqrt{h}}-1}e^{-\frac{w}{x \sqrt{h}+h}}}{\Gamma \left( \frac{1}{\sqrt{h}}\right) (x\sqrt{h}+h)^\frac{1}{ \sqrt{h}}}f_X(w)\mathrm {d}w. \end{aligned}$$

If we define a random variable \(W\sim Gamma(h^{-1/2},x\sqrt{h}+h)\) with mean

$$\begin{aligned} \mu _W=h^{-1/2}(x\sqrt{h}+h), \end{aligned}$$

variance \(Var(W)=h^{-1/2}(x\sqrt{h}+h)^2\), and third central moment \(E[(W-\mu _W)^3]=2h^{-1/2}(x\sqrt{h}+h)^3\), then we can regard the integral as the expectation \(E[f_X(W)]\) and apply a Taylor expansion twice, first around \(\mu _W\) and then around \(x\). This yields

$$\begin{aligned} E[f_X(W)]= & {} E\left[ f_X(\mu _W)+f_X'(\mu _W)(W-\mu _W) +\frac{f_X''(\mu _W)}{2}(W-\mu _W)^2+...\right] \\= & {} f_X(x+\sqrt{h})+\frac{1}{2}f_X''(x+\sqrt{h})\frac{1}{\sqrt{h}}(x\sqrt{h}+h)^2+...\\= & {} f_X(x)+\left[ f_X'(x)+\frac{1}{2}x^2f_X''(x)\right] \sqrt{h}+o(\sqrt{h}). \end{aligned}$$

Hence, we have

$$\begin{aligned} Bias[A_h(x)]=\left[ f_X'(x)+\frac{1}{2}x^2f_X''(x)\right] \sqrt{h}+o(\sqrt{h}), \end{aligned}$$

which is of order \(\sqrt{h}\).
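As an illustrative check of this expansion, the following is a minimal symbolic sketch, assuming the exponential density \(f_X(x)=e^{-x}\) as an example: it verifies that \(f_X(\mu _W)+\frac{1}{2}f_X''(\mu _W)Var(W)\), with \(\mu _W=x+\sqrt{h}\) and \(Var(W)=\sqrt{h}(x+\sqrt{h})^2\), agrees with \(f_X(x)+[f_X'(x)+\frac{1}{2}x^2f_X''(x)]\sqrt{h}\) up to \(o(\sqrt{h})\).

import sympy as sp

x, s = sp.symbols('x s', positive=True)          # s stands for sqrt(h)
f = sp.exp(-x)                                   # assumed example density on [0, infinity)

# f_X(mu_W) + (1/2) f_X''(mu_W) Var(W), with mu_W = x + s and Var(W) = s*(x + s)**2
approx = f.subs(x, x + s) + sp.Rational(1, 2)*sp.diff(f, x, 2).subs(x, x + s)*s*(x + s)**2

# Claimed expansion: f_X(x) + [f_X'(x) + x^2 f_X''(x)/2] * sqrt(h)
claimed = f + (sp.diff(f, x) + x**2*sp.diff(f, x, 2)/2)*s

# The difference has no constant or sqrt(h) term, i.e. it is o(sqrt(h))
print(sp.simplify(sp.series(approx - claimed, s, 0, 2).removeO()))   # prints 0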

Next, we derive the variance, which is

$$\begin{aligned} Var[A_h(x)]=n^{-1}E[K^2(X_1;x,h)]+O(n^{-1}). \end{aligned}$$

First, consider the expectation term,

$$\begin{aligned} E[K^2(X_1;x,h)]= & {} \int _{0}^\infty \frac{v^{\frac{2}{\sqrt{h}}-2} e^{-\frac{2v}{x\sqrt{h}+h}}}{\Gamma ^2\left( \frac{1}{\sqrt{h}}\right) (x\sqrt{h}+h)^\frac{2}{\sqrt{h}}}f_X(v)\mathrm {d}v\\= & {} \frac{\Gamma \left( \frac{2}{\sqrt{h}}-1\right) \left( \frac{x\sqrt{h} +h}{2}\right) ^{\frac{2}{\sqrt{h}}-1}}{\Gamma ^2\left( \frac{1}{\sqrt{h}}\right) (x\sqrt{h}+h)^\frac{2}{\sqrt{h}}}\int _{0}^\infty \frac{v^{\left( \frac{2}{\sqrt{h}} -1\right) -1}e^{-\frac{2v}{x\sqrt{h}+h}}}{\Gamma \left( \frac{2}{\sqrt{h}}-1\right) \left( \frac{x\sqrt{h}+h}{2}\right) ^{\frac{2}{\sqrt{h}}-1}}f_X(v)\mathrm {d}v\\= & {} B(x,h)E[f_X(V)], \end{aligned}$$

where V is a \(Gamma(2h^{-1/2}-1,(x\sqrt{h}+h)/2)\) random variable, \(B(x,h)\) is the factor outside the integral, and the integral itself can be regarded as \(E[f_X(V)]\). As before, the random variable V has mean \(\mu _V=(2h^{-1/2}-1)(x\sqrt{h}+h)/2\) and

$$\begin{aligned} Var(V)=(2h^{-1/2}-1)(x\sqrt{h}+h)^2/4. \end{aligned}$$

Proceeding in the same fashion as for \(E[f_X(W)]\), we have

$$\begin{aligned} E[f_X(V)]= & {} f_X\left( x+\sqrt{h}-\frac{x\sqrt{h}+h}{2}\right) +\frac{1}{2}f_X''\left( x+\sqrt{h}-\frac{x\sqrt{h}+h}{2}\right) \\&\; \; \; \times \left( \frac{2}{\sqrt{h}}-1\right) \left( \frac{x\sqrt{h}+h}{2}\right) ^2+...\\= & {} f_X(x)+O(\sqrt{h}). \end{aligned}$$

Now, let \(R(z)=\frac{\sqrt{2\pi }z^{z+\frac{1}{2}}}{e^z\Gamma (z+1)}\), which tends to 1 as \(z\rightarrow \infty \) by Stirling's formula; then \(B(x,h)\) can be rewritten as

$$\begin{aligned} B(x,h)= & {} \frac{\sqrt{2\pi }\left( \frac{2}{\sqrt{h}}-2\right) ^{\frac{2}{\sqrt{h}}-\frac{3}{2}}}{e^{\frac{2}{\sqrt{h}}-2}R\left( \frac{2}{\sqrt{h}}-2\right) }\frac{e^{\frac{2}{\sqrt{h}}-2}R^2\left( \frac{1}{\sqrt{h}}-1\right) }{2\pi \left( \frac{1}{\sqrt{h}}-1\right) ^{\frac{2}{\sqrt{h}}-1}} \frac{1}{2^{\frac{2}{\sqrt{h}}-1}(x\sqrt{h}+h)}\\= & {} \frac{R^2\left( \frac{1}{\sqrt{h}}-1\right) }{2(x+\sqrt{h}) \sqrt{\pi (1-\sqrt{h})}R\left( \frac{2}{\sqrt{h}}-2\right) h^{\frac{1}{4}}}. \end{aligned}$$

Thus, we obtain Eq. 13, and the proof is completed.
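Since \(R(z)\rightarrow 1\), the factor \(B(x,h)\) behaves like \(\{2x\sqrt{\pi }h^{1/4}\}^{-1}\) for fixed interior \(x\), which is what produces the \(O(n^{-1}h^{-1/4})\) variance order. The following numerical sketch (an assumed illustration, not from the paper) evaluates the ratio of \(R\)-factors appearing in \(B(x,h)\):

import numpy as np
from scipy.special import gammaln

def log_R(z):
    # log R(z) = (1/2) log(2*pi) + (z + 1/2) log z - z - log Gamma(z + 1)
    return 0.5*np.log(2*np.pi) + (z + 0.5)*np.log(z) - z - gammaln(z + 1)

for h in [1e-2, 1e-3, 1e-4]:
    z1 = 1/np.sqrt(h) - 1                       # argument of R in the numerator of B(x, h)
    z2 = 2/np.sqrt(h) - 2                       # argument of R in the denominator of B(x, h)
    print(h, np.exp(2*log_R(z1) - log_R(z2)))   # ratio R^2(z1)/R(z2) approaches 1 as h -> 0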

Proof of Theorem 2.4

We have already expanded \(J_h(x)\) up to the \(\sqrt{h}\) term. Extending the expansion up to the h term results in

$$\begin{aligned} J_h(x)= & {} f_X(x)+\sqrt{h}f_X'(x)+\frac{h}{2}f_X''(x)+\frac{1}{2}(x^2\sqrt{h}+2xh+h\sqrt{h})\\&\; \; \; \times [f_X''(x)+\sqrt{h}f_X'''(x)+o(\sqrt{h})]+\frac{h}{3}(x+\sqrt{h})^3f_X'''(x)+o(h)\\= & {} f_X(x)\bigg [1+\left\{ f_X'(x)+\frac{1}{2}x^2f_X''(x)\right\} \frac{\sqrt{h}}{f_X(x)}\\&\; \; \; +\left\{ \left( x+\frac{1}{2}\right) f_X''(x)+x^2\left( \frac{x}{3} +\frac{1}{2}\right) f_X'''(x)\right\} \frac{h}{f_X(x)}+o(h)\bigg ]\\= & {} f_X(x)\left[ 1+\frac{a(x)}{f_X(x)}\sqrt{h}+\frac{b(x)}{f_X(x)}h+o(h)\right] , \end{aligned}$$

where \(a(x)=f_X'(x)+\frac{1}{2}x^2f_X''(x)\), and \(b(x)=\left( x+\frac{1}{2}\right) f_X''(x)+x^2\left( \frac{x}{3}+\frac{1}{2} \right) f_X'''(x)\). By taking the natural logarithm and using its expansion, we have

$$\begin{aligned} \log J_h(x)= & {} \log f_X(x)+\sum _{k=1}^{\infty }\frac{(-1)^{k-1}}{k} \left[ \frac{a(x)}{f_X(x)}\sqrt{h}+\frac{b(x)}{f_X(x)}h+o(h)\right] ^k\\= & {} \log f_X(x)+\frac{a(x)}{f_X(x)}\sqrt{h}+\frac{b(x)}{f_X(x)}h+o(h) -\frac{1}{2}\left[ \frac{a(x)}{f_X(x)}\sqrt{h}+\frac{b(x)}{f_X(x)}h+o(h)\right] ^2\\&\; \; \; +\frac{1}{3}\left[ \frac{a(x)}{f_X(x)}\sqrt{h}+\frac{b(x)}{f_X(x)}h+o(h)\right] ^3-...\\= & {} \log f_X(x)+\frac{a(x)}{f_X(x)}\sqrt{h}+\left[ b(x)-\frac{a^2(x)}{2f_X(x)} \right] \frac{h}{f_X(x)}+o(h). \end{aligned}$$

Next, if we define \(J_{4h}(x)=E[A_{4h}(x)]\) (the same expectation with the bandwidth quadrupled), so that

$$\begin{aligned} \ln J_{4h}(x)=\ln f_X(x)+\frac{2a(x)}{f_X(x)}\sqrt{h}+\frac{4}{f_X(x)}\left[ b(x) -\frac{a^2(x)}{2f_X(x)}\right] h+o(h), \end{aligned}$$

we can set up conditions to eliminate the term \(\sqrt{h}\) while keeping the term \(\ln f_X(x)\). Now, since \(\ln [J_h(x)]^{t_1}[J_{4h}(x)]^{t_2}\) equals

$$\begin{aligned} (t_1+t_2)\ln f_X(x)+(t_1+2t_2)\frac{a(x)}{f_X(x)}\sqrt{h}+(t_1+4t_2)\left[ b(x) -\frac{a^2(x)}{2f_X(x)}\right] \frac{h}{f_X(x)}+o(h), \end{aligned}$$

the conditions we need are \(t_1+t_2=1\) and \(t_1+2t_2=0\). It is obvious that the solution is \(t_1=2\) and \(t_2=-1\), and we get

$$\begin{aligned} \ln [J_h(x)]^2[J_{4h}(x)]^{-1}=\ln f_X(x)-\frac{2}{f_X(x)}\left[ b(x)-\frac{a^2(x)}{2f_X(x)}\right] h+o(h). \end{aligned}$$

Taking the exponential of both sides and using the series expansion of the exponential function, we have

$$\begin{aligned}{}[J_h(x)]^2[J_{4h}(x)]^{-1}= & {} f_X(x)\sum _{k=0}^{\infty }\frac{(-1)^k}{k!} \left[ \frac{2}{f_X(x)}\left\{ b(x)-\frac{a^2(x)}{2f_X(x)}\right\} h+o(h)\right] ^k\\= & {} f_X(x)\left[ 1-\frac{2}{f_X(x)}\left\{ b(x)-\frac{a^2(x)}{2f_X(x)}\right\} h +o(h)\right] \\= & {} f_X(x)-2\left[ b(x)-\frac{a^2(x)}{2f_X(x)}\right] h+o(h)\\= & {} f_X(x)+O(h). \end{aligned}$$
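As a sanity check of this cancellation, here is a minimal symbolic sketch, assuming \(J_h(x)=f_X(x)\left[ 1+\frac{a(x)}{f_X(x)}\sqrt{h}+\frac{b(x)}{f_X(x)}h\right] \) with \(f_X(x)\), \(a(x)\), and \(b(x)\) treated as free symbols; it confirms that \(2\log J_h(x)-\log J_{4h}(x)\) has no \(\sqrt{h}\) term.

import sympy as sp

s, f = sp.symbols('s f', positive=True)   # s = sqrt(h); f stands for f_X(x)
a, b = sp.symbols('a b')                  # a, b stand for a(x), b(x)

def J(s):
    # J_h(x) = f_X(x) * [1 + a(x)/f_X(x)*sqrt(h) + b(x)/f_X(x)*h], ignoring the o(h) remainder
    return f*(1 + (a/f)*s + (b/f)*s**2)

combo = 2*sp.log(J(s)) - sp.log(J(2*s))   # bandwidth 4h corresponds to sqrt(4h) = 2*sqrt(h)
print(sp.simplify(sp.series(combo, s, 0, 3).removeO()))
# The result equals log(f) + (a**2/f**2 - 2*b/f)*s**2: the sqrt(h) term is eliminated,
# matching log f_X(x) - (2/f_X(x)) * [b(x) - a^2(x)/(2 f_X(x))] * h.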

Proof of Theorem 2.6

Because of the definitions of \(J_h(x)\) and \(J_{4h}(x)\), we can write \(A_h(x)=J_h(x)+Y\) and \(A_{4h}(x)=J_{4h}(x)+Z\), where Y and Z are random variables with \(E(Y)=E(Z)=0\), \(Var(Y)=Var[A_h(x)]\), and \(Var(Z)=Var[A_{4h}(x)]\). Then, by the expansion \((1+p)^q=1+pq+O(p^2)\), we get

$$\begin{aligned} {\widetilde{f}}_X(x)= & {} [J_h(x)]^2[J_{4h}(x)]^{-1}\left[ 1+\frac{Y}{J_h(x)} \right] ^2\left[ 1+\frac{Z}{J_{4h}(x)}\right] ^{-1}\\= & {} [J_h(x)]^2[J_{4h}(x)]^{-1}\left[ 1+\frac{2Y}{J_h(x)} +O\left\{ \frac{Y^2}{J_h^2(x)}\right\} \right] \left[ 1 -\frac{Z}{J_{4h}(x)}+O\left\{ \frac{Z^2}{J_{4h}^2(x)}\right\} \right] \\= & {} [J_h(x)]^2[J_{4h}(x)]^{-1}+\frac{2J_h(x)}{J_{4h}(x)}Y -\left[ \frac{J_h(x)}{J_{4h}(x)}\right] ^2Z+O[(Y+Z)^2]. \end{aligned}$$

Hence,

$$\begin{aligned} E[{\widetilde{f}}_X(x)]= & {} [J_h(x)]^2[J_{4h}(x)]^{-1}+\frac{2J_h(x)}{J_{4h}(x)}E(Y)-\left[ \frac{J_h(x)}{J_{4h}(x)}\right] ^2E(Z)+O[E\{(Y+Z)^2\}]\\= & {} f_X(x)-2\left[ b(x)-\frac{a^2(x)}{2f_X(x)}\right] h+o(h) +O\left( \frac{1}{nh^{\frac{1}{4}}}\right) , \end{aligned}$$

and its bias is

$$\begin{aligned} Bias[{\widetilde{f}}_X(x)]=-2\left[ b(x)-\frac{a^2(x)}{2f_X(x)}\right] h +o(h)+O\left( \frac{1}{nh^{\frac{1}{4}}}\right) . \end{aligned}$$

Proof of Theorem 2.7

By the usual calculation for i.i.d. random variables, we have

$$\begin{aligned} Cov[A_h(x),A_{4h}(x)]=\frac{1}{n}E[K(X_1;x,h)K(X_1;x,4h)]+O\left( \frac{1}{n}\right) . \end{aligned}$$

Now, for the expectation,

$$\begin{aligned} E[K(X_1;x,h)K(X_1;x,4h)]= & {} \int _{0}^{\infty }\frac{t^{\frac{1}{\sqrt{h}} -1}e^{-\frac{t}{x\sqrt{h}+h}}}{\Gamma \left( \frac{1}{\sqrt{h}}\right) (x\sqrt{h} +h)^\frac{1}{\sqrt{h}}}\frac{t^{\frac{1}{2\sqrt{h}}-1}e^{-\frac{t}{2x\sqrt{h} +4h}}}{\Gamma \left( \frac{1}{2\sqrt{h}}\right) (2x\sqrt{h} +4h)^\frac{1}{2\sqrt{h}}}f_X(t)\mathrm {d}t\\= & {} \frac{\Gamma \left( \frac{3}{2\sqrt{h}}-1\right) \left[ \frac{2\sqrt{h}(x +\sqrt{h})(x+2\sqrt{h})}{3x+5\sqrt{h}}\right] ^{\frac{3}{2\sqrt{h}}-1}}{\Gamma \left( \frac{1}{\sqrt{h}}\right) \Gamma \left( \frac{1}{2\sqrt{h}}\right) (x\sqrt{h}+h)^\frac{1}{\sqrt{h}}(2x\sqrt{h}+4h)^{\frac{1}{2\sqrt{h}}}}\\&\; \; \; \times \int _{0}^\infty \frac{t^{\left( \frac{3}{2\sqrt{h}}-1\right) -1}e^{-t\left[ \frac{3x+5\sqrt{h}}{2\sqrt{h}(x+\sqrt{h})(x+2\sqrt{h})}\right] }}{\Gamma \left( \frac{3}{2\sqrt{h}}-1\right) \left[ \frac{2\sqrt{h}(x+\sqrt{h})(x +2\sqrt{h})}{3x+5\sqrt{h}}\right] ^{\frac{3}{2\sqrt{h}}-1}}f_X(t)\mathrm {d}t\\= & {} C(x,h)E[f_X(T)], \end{aligned}$$

where \(C(x,h)\) is the factor outside the integral, and T is a gamma random variable with mean

$$\begin{aligned} \mu _T=\frac{3(x+\sqrt{h})(x+2\sqrt{h})}{3x+5\sqrt{h}}+O(\sqrt{h}) \end{aligned}$$

and variance \(Var(T)=O(\sqrt{h})\). Applying a Taylor expansion yields

$$\begin{aligned} E[f_X(T)]= & {} f_X(x)+\left[ \frac{3(x+\sqrt{h})(x+2\sqrt{h})}{3x+5\sqrt{h}} -x+O(\sqrt{h})\right] f_X'(x)+o(\sqrt{h})\\&\; \; \; +\frac{1}{2}f_X''\left[ \frac{3(x+\sqrt{h})(x+2\sqrt{h})}{3x +5\sqrt{h}}+O(\sqrt{h})\right] O(\sqrt{h})\\= & {} f_X(x)+O(\sqrt{h}). \end{aligned}$$

Using the definition of R(z) as before, we get

$$\begin{aligned} C(x,h)= & {} \frac{[2\sqrt{h}(x+\sqrt{h})(x+2\sqrt{h})]^{\frac{3}{2\sqrt{h}}-1}}{(x\sqrt{h}+h)^\frac{1}{\sqrt{h}}(2x\sqrt{h}+4h)^\frac{1}{2\sqrt{h}}(3x +5\sqrt{h})^{\frac{3}{2\sqrt{h}}-1}}\\&\times \frac{\sqrt{2\pi }\left( \frac{3}{2\sqrt{h}}-2\right) ^{\frac{3}{2\sqrt{h}} -\frac{3}{2}}}{e^{\frac{3}{2\sqrt{h}}-2}R\left( \frac{3}{2\sqrt{h}}-2\right) } \frac{e^{\frac{1}{\sqrt{h}}-1}R\left( \frac{1}{\sqrt{h}}-1\right) }{\sqrt{2\pi } \left( \frac{1}{\sqrt{h}}-1\right) ^{\frac{1}{\sqrt{h}}-\frac{1}{2}}} \frac{e^{\frac{1}{2\sqrt{h}}-1}R\left( \frac{1}{2\sqrt{h}}-1\right) }{\sqrt{2\pi }\left( \frac{1}{2\sqrt{h}}-1\right) ^{\frac{1}{2\sqrt{h}}-\frac{1}{2}}} \\= & {} \frac{R\left( \frac{1}{\sqrt{h}}-1\right) R\left( \frac{1}{2\sqrt{h}}-1\right) }{2h^{\frac{1}{4}}\sqrt{\pi }R\left( \frac{3}{2\sqrt{h}}-2\right) (3x+5\sqrt{h})}\frac{\left( \frac{3}{2}-2\sqrt{h}\right) ^{\frac{3}{2\sqrt{h}} -\frac{3}{2}}}{(2-2\sqrt{h})^{\frac{1}{\sqrt{h}}-\frac{1}{2}}(1 -2\sqrt{h})^{\frac{1}{2\sqrt{h}}-\frac{1}{2}}}\\&\times \left( \frac{x+\sqrt{h}}{3x+5\sqrt{h}}\right) ^{\frac{1}{2\sqrt{h}} -1}\left( \frac{2x+4\sqrt{h}}{3x+5\sqrt{h}}\right) ^{\frac{1}{\sqrt{h}}-1}, \end{aligned}$$

when \(x>h\) (the calculation for \(x\le h\) is similar). Hence, the covariance term is

$$\begin{aligned} Cov[A_h(x),A_{4h}(x)]=\frac{R\left( \frac{1}{\sqrt{h}}-1\right) R \left( \frac{1}{2\sqrt{h}}-1\right) }{2\sqrt{\pi }R\left( \frac{3}{2\sqrt{h}} -2\right) (3x+5\sqrt{h})}\frac{\left( \frac{3}{2}-2\sqrt{h}\right) ^{\frac{3}{2\sqrt{h}} -\frac{3}{2}}}{(2-2\sqrt{h})^{\frac{1}{\sqrt{h}}-\frac{1}{2}} (1-2\sqrt{h})^{\frac{1}{2\sqrt{h}}-\frac{1}{2}}}\\ \times \left( \frac{x+\sqrt{h}}{3x+5\sqrt{h}}\right) ^{\frac{1}{2\sqrt{h}} -1}\left( \frac{2x+4\sqrt{h}}{3x+5\sqrt{h}}\right) ^{\frac{1}{\sqrt{h}} -1}\frac{f_X(x)}{nh^{\frac{1}{4}}}+O\left( \frac{h^\frac{1}{4}}{n}\right) , \end{aligned}$$

when \(xh^{-1}\rightarrow \infty \), and

$$\begin{aligned} Cov[A_h(x),A_{4h}(x)]=\frac{R\left( \frac{1}{\sqrt{h}}-1\right) R \left( \frac{1}{2\sqrt{h}}-1\right) }{2\sqrt{\pi }R\left( \frac{3}{2\sqrt{h}} -2\right) (3c\sqrt{h}+5)}\frac{\left( \frac{3}{2}-2\sqrt{h}\right) ^{\frac{3}{2 \sqrt{h}}-\frac{3}{2}}}{(2-2\sqrt{h})^{\frac{1}{\sqrt{h}}-\frac{1}{2}}(1-2 \sqrt{h})^{\frac{1}{2\sqrt{h}}-\frac{1}{2}}}\\ \times \left( \frac{c\sqrt{h}+1}{3c\sqrt{h}+5}\right) ^{\frac{1}{2\sqrt{h}} -1}\left( \frac{2c\sqrt{h}+4}{3c\sqrt{h}+5}\right) ^{\frac{1}{\sqrt{h}}-1} \frac{f_X(x)}{nh^{\frac{3}{4}}}+O\left( \frac{1}{nh^\frac{1}{4}}\right) , \end{aligned}$$

when \(xh^{-1}\rightarrow c>0\).

Proof of Theorem 2.8

It is easy to show that \([J_h(x)][J_{4h}(x)]^{-1}=1+O(\sqrt{h})\) by using the expansion of \((1+p)^q\). This gives

$$\begin{aligned} Var[{\widetilde{f}}_X(x)]= & {} Var[2\{1+O(\sqrt{h})\}Y-\{1+O(\sqrt{h})\}^2Z] +Var[O\{(Y+Z)^2\}]\\= & {} Var[2A_h(x)-A_{4h}(x)]+o\left( \frac{1}{nh^{\frac{1}{4}}}\right) \\= & {} 4Var[A_h(x)]+Var[A_{4h}(x)]-4Cov[A_h(x),A_{4h}(x)]+o\left( \frac{1}{nh^{\frac{1}{4}}}\right) . \end{aligned}$$

Finally, since the expression above is just a linear combination of the two variance formulas and the covariance, the orders of the variance do not change: they are \(O(n^{-1}h^{-1/4})\) in the interior and \(O(n^{-1}h^{-3/4})\) in the boundary region.
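For illustration, here is a minimal simulation sketch of the estimator analyzed above, \({\widetilde{f}}_X(x)=[A_h(x)]^2[A_{4h}(x)]^{-1}\), where \(A_h(x)\) averages the gamma kernel \(K(\cdot ;x,h)\) of Theorem 2.2 over the sample; the exponential test density, sample size, and bandwidth are assumed choices for this sketch, not the paper's simulation settings.

import numpy as np
from scipy.stats import gamma, expon

def A(sample, x, h):
    # A_h(x) = (1/n) sum_i K(X_i; x, h), with K the Gamma(1/sqrt(h), x*sqrt(h) + h) density
    shape = 1.0/np.sqrt(h)
    scale = x*np.sqrt(h) + h
    return gamma.pdf(sample, a=shape, scale=scale).mean()

def f_tilde(sample, x, h):
    # Combined estimator: squaring A_h and dividing by A_{4h} removes the sqrt(h) bias term
    return A(sample, x, h)**2 / A(sample, x, 4*h)

rng = np.random.default_rng(0)
sample = expon.rvs(size=2000, random_state=rng)    # assumed true density: f_X(x) = exp(-x), x >= 0
h = 0.01                                           # illustrative bandwidth choice

for x in [0.05, 0.5, 1.0, 2.0]:
    print(x, f_tilde(sample, x, h), expon.pdf(x))  # estimate vs. true density value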


About this article


Cite this article

Fauzi, R.R., Maesono, Y. New type of gamma kernel density estimator. J. Korean Stat. Soc. 49, 882–900 (2020). https://doi.org/10.1007/s42952-019-00040-w


  • DOI: https://doi.org/10.1007/s42952-019-00040-w
