Abstract
One-sided cross-validation (OSCV) is a bandwidth selection method initially introduced by Hart and Yi (J Am Stat Assoc 93(442):620–631, 1998) in the context of smooth regression functions. Martínez-Miranda et al. (in Gregoriou (ed) Operational risk towards Basel III: best practices and issues in modeling, management and regulation, Wiley, Hoboken, 2009) developed a version of OSCV for smooth density functions. This article extends the method to nonsmooth densities. It also introduces a fully robust OSCV modification that produces consistent OSCV bandwidths in both the smooth and nonsmooth cases. Practical implementations of the OSCV method for smooth and nonsmooth densities are discussed. One of the cross-validation kernels considered has the potential to improve the OSCV method’s performance in the regression context.
References
Bowman AW (1984) An alternative method of cross-validation for the smoothing of density estimates. Biometrika 71(2):353–360
Chiu S-T (1991) The effect of discretization error on bandwidth selection for kernel density estimation. Biometrika 78(2):436–441
Cline DBH, Hart JD (1991) Kernel estimation of densities with discontinuities or discontinuous derivatives. Statistics 22(1):69–84
Gámiz Pérez ML, Janys L, Martínez Miranda MD, Nielsen JP (2013a) Bandwidth selection in marker dependent kernel hazard estimation. Comput Stat Data Anal 68:155–169
Gámiz Pérez ML, Martínez Miranda MD, Nielsen JP (2013b) Smoothing survival densities in practice. Comput Stat Data Anal 58:368–382
Gámiz ML, Mammen E, Martínez Miranda MD, Nielsen JP (2016) Double one-sided cross-validation of local linear hazards. J R Stat Soc Ser B Stat Methodol 78(4):755–779
Härdle W (1991) Smoothing techniques. Springer series in statistics. Springer, New York (with implementation in S)
Hart JD, Yi S (1998) One-sided cross-validation. J Am Stat Assoc 93(442):620–631
Jones MC, Marron JS, Sheather SJ (1996) A brief survey of bandwidth selection for density estimation. J Am Stat Assoc 91(433):401–407
Köhler M, Schindler A, Sperlich S (2014) A review and comparison of bandwidth selection methods for kernel regression. Int Stat Rev 82(2):243–274. https://doi.org/10.1111/insr.12039
Loader CR (1999) Bandwidth selection: classical or plug-in? Ann Stat 27(2):415–438
Mammen E, Martínez Miranda MD, Nielsen JP, Sperlich S (2011) Do-validation for kernel density estimation. J Am Stat Assoc 106(494):651–660. https://doi.org/10.1198/jasa.2011.tm08687
Mammen E, Martínez Miranda MD, Nielsen JP, Sperlich S (2014) Further theoretical and practical insight to the do-validated bandwidth selector. J Korean Stat Soc 43(3):355–365
Martínez-Miranda MD, Nielsen JP, Sperlich S (2009) One sided cross validation for density estimation. In: Gregoriou GN (ed) Operational risk towards Basel III: best practices and issues in modeling, management and regulation. Wiley, Hoboken, pp 177–196
Rudemo M (1982) Empirical choice of histograms and kernel density estimators. Scand J Stat 9(2):65–78
Savchuk O (2017a) ICV: Indirect Cross-Validation (ICV) for Kernel Density Estimation. R package version 1.0
Savchuk O (2017b) OSCV: One-Sided Cross-Validation. R package version 1.0
Savchuk OY, Hart JD (2017) Fully robust one-sided cross-validation for regression functions. Comput Stat. https://doi.org/10.1007/s00180-017-0713-7
Savchuk OY, Hart JD, Sheather SJ (2010) Indirect cross-validation for density estimation. J Am Stat Assoc 105(489):415–423
Savchuk O, Hart J, Sheather S (2011) An empirical study of indirect cross-validation. In: Hunter D, Rosenberger J, Richards D (eds) Nonparametric statistics and mixture models. World Scientific Publishing, Hackensack, NJ, pp 288–308. https://doi.org/10.1142/9789814340564_0017
Savchuk OY, Hart JD, Sheather SJ (2013) One-sided cross-validation for nonsmooth regression functions. J Nonparametr Stat 25(4):889–904
Savchuk OY, Hart JD, Sheather SJ (2016) Corrigendum to “One-sided cross-validation for nonsmooth regression functions”. [J. Nonparametr. Stat., 25(4): 889–904, 2013]. J Nonparametr Stat 28(4):875–877
Sheather SJ, Jones MC (1991) A reliable data-based bandwidth selection method for kernel density estimation. J R Stat Soc Ser B 53(3):683–690
Silverman BW (1986) Density estimation for statistics and data analysis. Monographs on statistics and applied probability. Chapman & Hall, London
Tenreiro C (2017) A weighted least-squares cross-validation bandwidth selector for kernel density estimation. Commun Stat Theory Methods 46(7):3438–3458. https://doi.org/10.1080/03610926.2015.1062108
van Eeden C (1985) Mean integrated squared error of kernel estimators when the density and its derivative are not necessarily continuous. Ann Inst Stat Math 37(3):461–472
van Es B (1992) Asymptotics for least squares cross-validation bandwidths in nonsmooth cases. Ann Stat 20(3):1647–1657
Wand MP, Jones MC (1995) Kernel smoothing. Volume 60 of monographs on statistics and applied probability. Chapman and Hall Ltd., London
Yi S (1996) On one-sided cross-validation in nonparametric regression. Ph.D. dissertation, Texas A&M University
Acknowledgements
The author appreciates the comments of the Associate Editor and the referees, especially the idea of extending the OSCV method to the nonsmooth case in which a density has finitely many simple discontinuities.
Appendix
1.1 Notation
For an arbitrary function g, define the following functionals:
Based on \(D_g\) and \(G_g\) we define
The theoretical results in this article involve B(K) and B(L) for the two-sided and one-sided kernels K and L, respectively. In the case when K is symmetrical, one may derive from (12) that
By taking into account that L is supported on \([0,\infty )\), it appears that
1.2 Identity of the left-sided and right-sided OSCV functions in the case of a symmetrical generating kernel H
As we noted above, \(K_R(u)=K_L(-u)\) in the case when H is symmetrical.
Theorem
In the case when H is symmetrical, \(\text{ OSCV }_{K_L}(b)=\text{ OSCV }_{K_R}(b)\) for any \(b>0\). This implies \({{\hat{h}}}_{OSCV,K_L}={{\hat{h}}}_{OSCV,K_R}\).
Proof
For any \(b>0\), consider
After the change of variables \(u=\displaystyle {\frac{x-\frac{X_i+X_j}{2}}{b}}\), we get
Next, consider
This completes the proof that \(\text{ OSCV }_{K_L}(b)=\text{ OSCV }_{K_R}(b)\) for any \(b>0\), which in turn implies the equality \({{\hat{h}}}_{OSCV,K_L}={{\hat{h}}}_{OSCV,K_R}\). \(\square \)
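The two steps of the argument can be sketched as follows. This is a reconstruction under an assumption, not the article's own display: the OSCV function is taken to be the least-squares cross-validation criterion evaluated with a one-sided kernel, so that it consists of an integrated squared estimate and a leave-one-out sum. The symbol \(\delta _{ij}\) is introduced here for illustration only.

```latex
% Assumed form of the criterion for a kernel K (illustrative notation):
%   OSCV_K(b) = \int \hat f_{b,K}^2(x)\,dx - \frac{2}{n}\sum_{i=1}^n \hat f_{b,K}^{(-i)}(X_i).
% Step 1: integrated squared estimate, with u = (x - (X_i+X_j)/2)/b
% and \delta_{ij} = (X_i - X_j)/(2b).
\begin{align*}
\int K_R\!\left(\frac{x-X_i}{b}\right) K_R\!\left(\frac{x-X_j}{b}\right) dx
  &= b \int K_R(u-\delta_{ij})\, K_R(u+\delta_{ij})\, du \\
  &= b \int K_L(-u+\delta_{ij})\, K_L(-u-\delta_{ij})\, du \\
  &= b \int K_L(v+\delta_{ij})\, K_L(v-\delta_{ij})\, dv \qquad (v=-u) \\
  &= \int K_L\!\left(\frac{x-X_i}{b}\right) K_L\!\left(\frac{x-X_j}{b}\right) dx .
\end{align*}
% Step 2: leave-one-out sum, using K_R(u) = K_L(-u) and swapping the labels i and j.
\begin{align*}
\sum_{i\ne j} K_R\!\left(\frac{X_i-X_j}{b}\right)
  = \sum_{i\ne j} K_L\!\left(\frac{X_j-X_i}{b}\right)
  = \sum_{i\ne j} K_L\!\left(\frac{X_i-X_j}{b}\right).
\end{align*}
```

Under this assumed form, both parts of the criterion are unchanged when \(K_L\) is replaced by \(K_R\), which is the statement of the theorem.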
The above proof can be easily adjusted to the slightly differently defined OSCV functions used in Mammen et al. (2011, 2014).
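The identity can also be checked numerically. The following is a minimal sketch, not the article's implementation. It assumes that the OSCV function is the least-squares cross-validation criterion computed with a one-sided kernel, and it uses an illustrative one-sided kernel of local linear type built from the Gaussian density (supported on \([0,\infty )\), integrating to one, with zero first moment). The names `k_left`, `k_right`, `symmetric_grid`, and `oscv` are hypothetical, and which kernel is called "left" or "right" is a labeling assumption.

```python
import numpy as np

# Half-line moments of the standard normal density phi:
# integral over [0, inf) of u^k * phi(u) for k = 0, 1, 2.
M0, M1, M2 = 0.5, 1.0 / np.sqrt(2.0 * np.pi), 0.5


def k_left(u):
    """Illustrative one-sided kernel on [0, inf): (M2 - M1*u) * phi(u) / (M0*M2 - M1^2)."""
    u = np.asarray(u, dtype=float)
    phi = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return np.where(u >= 0.0, (M2 - M1 * u) * phi / (M0 * M2 - M1**2), 0.0)


def k_right(u):
    """Mirror image of k_left, so that k_right(u) = k_left(-u)."""
    return k_left(-np.asarray(u, dtype=float))


def symmetric_grid(half_width=10.0, step=0.005):
    """Integration grid that is exactly symmetric about zero."""
    pos = step * np.arange(1, int(round(half_width / step)) + 1)
    return np.concatenate((-pos[::-1], [0.0], pos)), step


def oscv(b, x, kernel):
    """Assumed criterion: integral of fhat^2 minus (2/n) * sum of leave-one-out estimates."""
    x = np.asarray(x, dtype=float)
    n = x.size
    u, step = symmetric_grid()
    # Midpoint change of variables: u = (t - (X_i + X_j)/2) / b.
    delta = (x[:, None] - x[None, :]) / (2.0 * b)
    prod = kernel(u - delta[..., None]) * kernel(u + delta[..., None])
    integral_sq = prod.sum() * step / (n**2 * b)  # Riemann sum over all pairs (i, j)
    off_diag = ~np.eye(n, dtype=bool)
    diffs = (x[:, None] - x[None, :]) / b
    loo = kernel(diffs)[off_diag].sum() / (n * (n - 1) * b)
    return integral_sq - 2.0 * loo


rng = np.random.default_rng(0)
data = rng.normal(size=20)
for b in (0.4, 0.8):
    gap = oscv(b, data, k_left) - oscv(b, data, k_right)
    print(f"b = {b}: difference between the two criteria = {gap:.2e}")
```

Because the grid is symmetric about zero, the mirrored kernel produces the same integrand values in reverse order, so the two criteria agree up to floating-point summation error regardless of how coarse the grid is.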
Cite this article
Savchuk, O.Y. One-sided cross-validation for nonsmooth density functions. Comput Stat 35, 1253–1272 (2020). https://doi.org/10.1007/s00180-019-00938-3