Abstract
We discuss a new kernel-type estimator for a density function \(f_X(x)\) with nonnegative support. We use a type of gamma density as the kernel function and modify it with expansions of the exponential and logarithmic functions. The resulting modified gamma kernel density estimator is not only free of boundary bias, but its variance is also of smaller order: \(O(n^{-1}h^{-1/4})\) in the interior and \(O(n^{-1}h^{-3/4})\) in the boundary region. Furthermore, the optimal orders of its mean squared error are \(O(n^{-8/9})\) in the interior and \(O(n^{-8/11})\) in the boundary region. Simulation results demonstrating the proposed method's performance are also presented.
Acknowledgements
The authors would like to thank the two anonymous referees and the editor-in-chief for their careful reading and valuable comments, which improved the manuscript.
Funding
This work was supported by JSPS Grant-in-Aid for Scientific Research (B) [Grant number 16H02790].
Appendices
Proof of Theorem 2.2
First, by the usual argument for i.i.d. random variables, we have
If we define a random variable \(W\sim Gamma(h^{-1/2},x\sqrt{h}+h)\) with mean \(\mu _W=h^{-1/2}(x\sqrt{h}+h)=x+\sqrt{h}\), variance \(Var(W)=h^{-1/2}(x\sqrt{h}+h)^2\), and third central moment \(E[(W-\mu _W)^3]=2h^{-1/2}(x\sqrt{h}+h)^3\), we can view the integral as the expectation of \(f_X(W)\), and we can then apply a Taylor expansion twice, first around \(\mu _W\) and then around x. This results in
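In sketch form, carrying out the two expansions (using \(\mu _W=x+\sqrt{h}\) and \(Var(W)=x^2\sqrt{h}+2xh+h^{3/2}\)) gives, to first order,

```latex
\begin{align*}
J_h(x) = E[f_X(W)]
  &= f_X(\mu_W) + \tfrac{1}{2}\,Var(W)\,f_X''(\mu_W)
     + \tfrac{1}{6}\,E[(W-\mu_W)^3]\,f_X'''(\mu_W) + \cdots \\
  &= f_X(x) + \sqrt{h}\left\{f_X'(x) + \tfrac{1}{2}x^2 f_X''(x)\right\} + O(h).
\end{align*}
```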
Hence, we have
which is in the order of \(\sqrt{h}\).
Next, we derive the formula of the variance, which is
First, we take a look at the expectation part,
where V is a \(Gamma(2h^{-1/2}-1,(x\sqrt{h}+h)/2)\) random variable, B(x, h) is a factor outside the integral, and the integral itself can be viewed as \(E[f_X(V)]\). As before, the random variable V has mean \(\mu _V=(2h^{-1/2}-1)(x\sqrt{h}+h)/2\) and
In the same fashion as in \(E[f_X(W)]\) before, we have
Now, let \(R(z)=\frac{\sqrt{2\pi }z^{z+\frac{1}{2}}}{e^z\Gamma (z+1)}\); then, B(x, h) can be rewritten to become
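By Stirling's formula, \(\Gamma (z+1)=\sqrt{2\pi }\,z^{z+\frac{1}{2}}e^{-z}\{1+\frac{1}{12z}+O(z^{-2})\}\), so this ratio satisfies

```latex
R(z)=\frac{\sqrt{2\pi}\,z^{z+\frac{1}{2}}}{e^{z}\,\Gamma(z+1)}
    = 1-\frac{1}{12z}+O\!\left(\frac{1}{z^{2}}\right)
    \;\longrightarrow\; 1 \qquad (z\to\infty),
```

which is what allows B(x, h) to be controlled as \(h\rightarrow 0\).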
Thus, we obtain Eq. 13, and the proof is completed.
Proof of Theorem 2.4
We have already expanded \(J_h(x)\) until the \(\sqrt{h}\) term. Now, extending it until the h term results in
where \(a(x)=f_X'(x)+\frac{1}{2}x^2f_X''(x)\), and \(b(x)=\left( x+\frac{1}{2}\right) f_X''(x)+x^2\left( \frac{x}{3}+\frac{1}{2} \right) f_X'''(x)\). By taking the natural logarithm and using its expansion, we have
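With a(x) and b(x) as just defined, a sketch of the logarithmic expansion, using \(\ln (1+p)=p-\frac{1}{2}p^2+O(p^3)\) with \(p=\sqrt{h}\,a(x)/f_X(x)+h\,b(x)/f_X(x)\), is

```latex
\ln J_h(x) = \ln f_X(x) + \sqrt{h}\,\frac{a(x)}{f_X(x)}
  + h\left\{\frac{b(x)}{f_X(x)} - \frac{a^2(x)}{2f_X^2(x)}\right\} + O(h^{3/2}).
```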
Next, if we define \(J_{4h}(x)=E[A_{4h}(x)]\) (using quadrupled bandwidth), i.e.,
we can set up conditions to eliminate the term \(\sqrt{h}\) while keeping the term \(\ln f_X(x)\). Now, since \(\ln [J_h(x)]^{t_1}[J_{4h}(x)]^{t_2}\) equals
the conditions we need are \(t_1+t_2=1\) and \(t_1+2t_2=0\). It is obvious that the solution is \(t_1=2\) and \(t_2=-1\), and we get
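Since \(J_{4h}(x)\) has the same expansion with \(\sqrt{h}\) replaced by \(2\sqrt{h}\) and h by 4h, substituting \(t_1=2\) and \(t_2=-1\) cancels the \(\sqrt{h}\) terms; in sketch form,

```latex
2\ln J_h(x) - \ln J_{4h}(x)
  = \ln f_X(x) + h\left\{\frac{a^2(x)}{f_X^2(x)} - \frac{2b(x)}{f_X(x)}\right\} + O(h^{3/2}).
```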
If we take the exponential function and use its expansion, we have
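To make the construction concrete, here is a minimal numerical sketch (our own illustration, not the authors' code; all names are ours): \(A_h(x)\) averages the \(Gamma(h^{-1/2},x\sqrt{h}+h)\) density over the sample, as in the proof of Theorem 2.2, and the combination \(t_1=2\), \(t_2=-1\) corresponds to the ratio \(A_h(x)^2/A_{4h}(x)\).

```python
import math

def gamma_pdf(u, shape, scale):
    """Gamma(shape, scale) density evaluated at u (0 for u <= 0)."""
    if u <= 0:
        return 0.0
    return math.exp((shape - 1.0) * math.log(u) - u / scale
                    - math.lgamma(shape) - shape * math.log(scale))

def a_hat(x, data, h):
    # Gamma-kernel estimator A_h(x): average of the
    # Gamma(h^{-1/2}, x*sqrt(h) + h) density over the sample
    shape = 1.0 / math.sqrt(h)
    scale = x * math.sqrt(h) + h
    return sum(gamma_pdf(u, shape, scale) for u in data) / len(data)

def f_hat(x, data, h):
    # Combination given by t1 = 2, t2 = -1:
    # exp(2 ln A_h - ln A_{4h}) = A_h(x)^2 / A_{4h}(x)
    return a_hat(x, data, h) ** 2 / a_hat(x, data, 4.0 * h)

# Illustration: a deterministic stand-in for an Exp(1) sample
sample = [-math.log(1.0 - (i + 0.5) / 500) for i in range(500)]
print(round(f_hat(1.0, sample, 0.05), 3))  # roughly exp(-1) ≈ 0.368
```

The quadrupled bandwidth enters only through the second call to `a_hat`, mirroring the definition of \(J_{4h}(x)\).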
Proof of Theorem 2.6
Because of the definitions of \(J_h(x)\) and \(J_{4h}(x)\), we can write \(A_h(x)=J_h(x)+Y\) and \(A_{4h}(x)=J_{4h}(x)+Z\), where Y and Z are random variables with \(E(Y)=E(Z)=0\), \(Var(Y)=Var[A_h(x)]\), and \(Var(Z)=Var[A_{4h}(x)]\). Then, by the expansion \((1+p)^q=1+pq+O(p^2)\), we get
Hence,
and its bias is
Proof of Theorem 2.7
By the usual calculation for i.i.d. random variables, we have
Now, for the expectation,
where C(x, h) is the factor outside the integral, and T is a random variable with mean
and variance \(Var(T)=O(\sqrt{h})\). Utilizing Taylor expansion results in
Using the definition of R(z) as before, we get
when \(x>h\) (for \(x\le h\), the calculation is similar). Hence, the covariance term is
when \(xh^{-1}\rightarrow \infty \), and
when \(xh^{-1}\rightarrow c>0\).
Proof of Theorem 2.8
It is easy to prove that \([J_h(x)][J_{4h}(x)]^{-1}=1+O(\sqrt{h})\) by using the expansion of \((1+p)^q\). This fact brings us to
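In sketch form, writing \(J_h(x)=f_X(x)+\sqrt{h}\,a(x)+O(h)\) and \(J_{4h}(x)=f_X(x)+2\sqrt{h}\,a(x)+O(h)\),

```latex
\frac{J_h(x)}{J_{4h}(x)}
 = \left(1+\sqrt{h}\,\frac{a(x)}{f_X(x)}+O(h)\right)
   \left(1+2\sqrt{h}\,\frac{a(x)}{f_X(x)}+O(h)\right)^{-1}
 = 1-\sqrt{h}\,\frac{a(x)}{f_X(x)}+O(h).
```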
Lastly, since the equation above is just a linear combination of the two variance formulas, the orders of the variance do not change: \(n^{-1}h^{-1/4}\) in the interior and \(n^{-1}h^{-3/4}\) in the boundary region.
Cite this article
Fauzi, R.R., Maesono, Y. New type of gamma kernel density estimator. J. Korean Stat. Soc. 49, 882–900 (2020). https://doi.org/10.1007/s42952-019-00040-w
Keywords
- Convergence rate
- Density function
- Exponential expansion
- Gamma density
- Kernel method
- Logarithmic expansion
- Nonparametric
- Variance reduction