Near-optimal Sample Complexity Bounds for Robust Learning of Gaussian Mixtures via Compression Schemes
Journal of the ACM (IF 2.3), Pub Date: 2020-10-06, DOI: 10.1145/3417994
Hassan Ashtiani, Shai Ben-David, Nicholas J. A. Harvey, Christopher Liaw, Abbas Mehrabian, Yaniv Plan

We introduce a novel technique for distribution learning based on a notion of sample compression. Any class of distributions that admits such a compression scheme can be learned with few samples. Moreover, if a class of distributions has a compression scheme, then so do the classes of products and of mixtures of those distributions. As an application of this technique, we prove that Θ̃(kd²/ε²) samples are necessary and sufficient for learning a mixture of k Gaussians in R^d, up to error ε in total variation distance. This improves both the known upper bound and the known lower bound for this problem. For mixtures of axis-aligned Gaussians, we show that Õ(kd/ε²) samples suffice, matching a known lower bound. Moreover, these results hold in an agnostic-learning (or robust-estimation) setting, in which the target distribution is only approximately a mixture of Gaussians. Our main upper bound is proven by showing that the class of Gaussians in R^d admits a small compression scheme.

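To make the compression notion in the abstract concrete, here is a hedged LaTeX sketch of the definition as it can be reconstructed from the abstract alone; the parameter triple (τ, t, m), the decoder name 𝒥, and the success probability 2/3 are assumptions for illustration, not quotations from the paper's body.

\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
% Hedged sketch: the triple (tau, t, m), the decoder J, and the success
% probability 2/3 are reconstructions from the abstract, not quoted text.
\noindent\textbf{Compression scheme (sketch).}
A class $\mathcal{F}$ of distributions admits a $(\tau, t, m)$ compression
scheme if there is a decoder $\mathcal{J}$ such that for every
$p \in \mathcal{F}$ and every $\varepsilon > 0$: given $m(\varepsilon)$
i.i.d.\ samples $S \sim p^{m(\varepsilon)}$, with probability at least
$2/3$ some sub-multiset $L \subseteq S$ with $|L| \le \tau(\varepsilon)$
and some bit string $B \in \{0,1\}^{t(\varepsilon)}$ satisfy
\[
  d_{\mathrm{TV}}\bigl(\mathcal{J}(L, B),\, p\bigr) \le \varepsilon .
\]
\end{document}

Under a definition of this shape, the abstract's closure claim says that products and k-mixtures of a compressible class are again compressible, which is what reduces the mixture-of-Gaussians bound to compressing a single Gaussian in R^d.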