Estimating minimum effect with outlier selection
Annals of Statistics (IF 4.5), Pub Date: 2021-01-29, DOI: 10.1214/20-aos1956
Alexandra Carpentier, Sylvain Delattre, Etienne Roquain, Nicolas Verzelen

We introduce one-sided versions of Huber’s contamination model, in which corrupted samples tend to take larger values than uncorrupted ones. Two intertwined problems are addressed: estimation of the mean of the uncorrupted samples (minimum effect) and selection of the corrupted samples (outliers). Regarding estimation of the minimum effect, we derive the minimax risks and introduce estimators that are adaptive with respect to the unknown number of contaminations. The optimal convergence rates differ from those in the classical Huber contamination model, which reveals the effect of the one-sided structural assumption on the contaminations. As for the problem of selecting the outliers, we formulate it in a multiple testing framework in which the location and scaling of the null hypotheses are unknown. We rigorously prove that it is possible to estimate the null hypothesis while maintaining a theoretical guarantee on the number of falsely selected outliers, both through the false discovery rate (FDR) and through post hoc bounds. As a by-product, we address a long-standing open issue on FDR control under equi-correlation, which reinforces the interest of removing dependency in such a setting.
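
To make the one-sided setting concrete, the following is a minimal simulation sketch, not the authors' estimator or procedure. It assumes Gaussian noise with unit variance and a single upward shift for all corrupted samples; the function name `simulate_one_sided_contamination` and the parameters `n`, `k`, `theta`, and `shift` are hypothetical choices made for illustration. It only shows that, when contaminations push values upward, both the sample mean and the sample median overshoot the mean of the uncorrupted samples.

```python
import random
import statistics


def simulate_one_sided_contamination(n, k, theta, shift, seed=0):
    """Draw n samples: n - k clean ones from N(theta, 1) and k corrupted
    ones from N(theta + shift, 1), with shift > 0 (illustrative assumption)."""
    rng = random.Random(seed)
    clean = [rng.gauss(theta, 1.0) for _ in range(n - k)]
    corrupted = [rng.gauss(theta + shift, 1.0) for _ in range(k)]
    return clean, corrupted


# Hypothetical setting: 20% of the samples are shifted upward by 5.
clean, corrupted = simulate_one_sided_contamination(n=2000, k=400, theta=0.0, shift=5.0)
samples = clean + corrupted

mean_all = statistics.fmean(samples)      # pulled upward by the contaminations
median_all = statistics.median(samples)   # less biased, but still shifted upward
mean_clean = statistics.fmean(clean)      # oracle value, unavailable in practice

print(f"mean of all samples:   {mean_all:.3f}")
print(f"median of all samples: {median_all:.3f}")
print(f"mean of clean samples: {mean_clean:.3f}")
```

In this sketch the median is a generic robust baseline from the classical two-sided setting; because every contamination lies on the same side, its bias does not cancel, which is the kind of structure the abstract says changes the optimal rates.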

Updated: 2021-01-29