Private Hypothesis Selection
IEEE Transactions on Information Theory ( IF 2.2 ) Pub Date : 2021-01-08 , DOI: 10.1109/tit.2021.3049802
Mark Bun , Gautam Kamath , Thomas Steinke , Zhiwei Steven Wu

We provide a differentially private algorithm for hypothesis selection. Given samples from an unknown probability distribution $P$ and a set of $m$ probability distributions $\mathcal{H}$, the goal is to output, in an $\varepsilon$-differentially private manner, a distribution from $\mathcal{H}$ whose total variation distance to $P$ is comparable to that of the best such distribution (which we denote by $\alpha$). The sample complexity of our basic algorithm is $O\left(\frac{\log m}{\alpha^{2}} + \frac{\log m}{\alpha\varepsilon}\right)$, representing a minimal cost for privacy when compared to the non-private algorithm. We can also handle infinite hypothesis classes $\mathcal{H}$ by relaxing to $(\varepsilon,\delta)$-differential privacy. We apply our hypothesis selection algorithm to give learning algorithms for a number of natural distribution classes, including Gaussians, product distributions, sums of independent random variables, piecewise polynomials, and mixture classes. Our hypothesis selection procedure allows us to generically convert a cover for a class into a learning algorithm, complementing known learning lower bounds which are stated in terms of the packing number of the class. As covering and packing numbers are often closely related, for constant $\alpha$ our algorithms achieve the optimal sample complexity for many classes of interest. Finally, we describe an application to private distribution-free PAC learning.

Updated: 2021-02-19