Tests and estimation strategies associated to some loss functions, Probability Theory and Related Fields


Probability Theory and Related Fields (IF 1.5), Pub Date: 2021-06-06, DOI: 10.1007/s00440-021-01065-1
Yannick Baraud

We consider the problem of estimating the joint distribution of n independent random variables. Given a loss function and a family of candidate probabilities, which we shall call a model, we aim at designing an estimator with values in our model that possesses good estimation properties not only when the distribution of the data belongs to the model but also when it lies close enough to it. The losses we have in mind are the total variation, Hellinger, Wasserstein and \(\mathbb{L}_{p}\)-distances, to name a few. We show that the risk of our estimator can be bounded by the sum of two terms: an approximation term that accounts for the loss between the true distribution and the model, and a complexity term that corresponds to the bound we would get if this distribution did belong to the model. Our results hold under mild assumptions on the true distribution of the data and are based on exponential deviation inequalities that are non-asymptotic and involve explicit constants. Interestingly, when the model reduces to two distinct probabilities, our procedure results in a robust test whose errors of the first and second kinds depend only on the losses between the true distribution and the two tested probabilities.
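To make the losses and the approximation term mentioned above concrete, here is a minimal sketch in Python for distributions on a finite set. It is an illustration only, not the paper's estimator or test: the function names, the toy distributions, and the two-element model are all hypothetical, and the Hellinger distance uses the normalization with the factor 1/2 (an assumed convention; others exist).

```python
import math

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def hellinger(p, q):
    """Hellinger distance, assuming the convention h^2 = (1/2) * sum (sqrt p - sqrt q)^2."""
    squared = 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared)

def lp_distance(p, q, r=2.0):
    """L_r distance between the probability mass functions (r >= 1)."""
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1.0 / r)

# Hypothetical true distribution, chosen to lie outside the model.
true_dist = [0.5, 0.3, 0.2]

# Hypothetical model made of two candidate probabilities.
model = {"P0": [0.6, 0.2, 0.2], "P1": [0.2, 0.3, 0.5]}

# Approximation term for the total variation loss: the smallest loss
# between the true distribution and any element of the model.
approx_tv = min(total_variation(true_dist, m) for m in model.values())

# Losses between the true distribution and each tested probability,
# the quantities on which the abstract says the test errors depend.
losses_tv = {name: total_variation(true_dist, m) for name, m in model.items()}

print(approx_tv, losses_tv)
```

In this toy setting the true distribution is closer to `P0` than to `P1` in total variation, so the approximation term is attained at `P0`. The complexity term from the abstract is not computed here, since its form is not given in this summary.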


Updated: 2021-06-07
