Abstract
We consider the problem of estimating the joint distribution of n independent random variables. Given a loss function and a family of candidate probabilities, which we call a model, we aim to design an estimator with values in the model that performs well not only when the distribution of the data belongs to the model but also when it lies close enough to it. The losses we have in mind include the total variation, Hellinger, Wasserstein and \({\mathbb {L}}_{p}\)-distances. We show that the risk of our estimator can be bounded by the sum of an approximation term, which accounts for the loss between the true distribution and the model, and a complexity term, which corresponds to the bound we would get if this distribution did belong to the model. Our results hold under mild assumptions on the true distribution of the data and are based on exponential deviation inequalities that are non-asymptotic and involve explicit constants. Interestingly, when the model reduces to two distinct probabilities, our procedure results in a robust test whose errors of the first and second kinds depend only on the losses between the true distribution and the two tested probabilities.
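As a reading aid, the shape of the risk bound described above can be written schematically. The notation is an assumption introduced here and is not taken from the paper: \(P^{\star }\) denotes the true distribution of the data, \({\mathscr {M}}\) the model, \(\ell \) the loss, \({\widehat{P}}\) the estimator, \(D_{n}({\mathscr {M}})\) a complexity term, and \(C\) a numerical constant. The precise statement, constants and assumptions are those of the paper and may differ from this sketch.

\[
{\mathbb {E}}\left[ \ell \left( P^{\star },{\widehat{P}}\right) \right] \le C\left[ \inf _{P\in {\mathscr {M}}}\ell \left( P^{\star },P\right) +D_{n}({\mathscr {M}})\right]
\]

The first term on the right-hand side is the approximation term: it vanishes when \(P^{\star }\) belongs to \({\mathscr {M}}\), in which case only the complexity term remains.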
Acknowledgements
The author would like to thank the two referees as well as Lucien Birgé for their many questions and comments which helped to improve this paper.
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No 811017.
Cite this article
Baraud, Y. Tests and estimation strategies associated to some loss functions. Probab. Theory Relat. Fields 180, 799–846 (2021). https://doi.org/10.1007/s00440-021-01065-1
DOI: https://doi.org/10.1007/s00440-021-01065-1
Keywords
- Density estimation
- Parametric estimation
- Robust estimation
- Wasserstein loss
- Total variation loss
- \({\mathbb {L}}_{p}\)-loss
- Minimax theory
- Robust testing
- GAN