Learning Privately with Labeled and Unlabeled Examples
Algorithmica (IF 0.9), Pub Date: 2020-08-03, DOI: 10.1007/s00453-020-00753-z
Amos Beimel, Kobbi Nissim, Uri Stemmer

A private learner is an algorithm that, given a sample of labeled individual examples, outputs a generalizing hypothesis while preserving the privacy of each individual. Kasiviswanathan et al. (FOCS 2008) gave a generic construction of private learners, in which the sample complexity is (generally) higher than what is needed for non-private learners. This gap in the sample complexity was then studied further in several follow-up papers, showing that (at least in some cases) the gap is unavoidable. Moreover, those papers considered ways to overcome the gap by relaxing either the privacy or the learning guarantees of the learner. We suggest an alternative approach, inspired by the (non-private) models of semi-supervised learning and active learning, where the focus is on the sample complexity of labeled examples, whereas unlabeled examples are of significantly lower cost. We consider private semi-supervised learners that operate on a random sample, where only a (hopefully small) portion of this sample is labeled. The learners have no control over which of the sample elements are labeled. Our main result is that the labeled sample complexity of private learners is characterized by the VC dimension. We present two generic constructions of private semi-supervised learners. The first construction is of learners whose labeled sample complexity is proportional to the VC dimension of the concept class; however, the unlabeled sample complexity of the algorithm is as large as the representation length of domain elements. Our second construction presents a new technique for decreasing the labeled sample complexity of a given private learner while roughly maintaining its unlabeled sample complexity. In addition, we show that in some settings the labeled sample complexity does not depend on the privacy parameters of the learner.
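To make the notion of a private learner concrete, here is a minimal Python sketch of a generic learner for a finite hypothesis class based on the exponential mechanism. This is an illustrative assumption, not one of the paper's semi-supervised constructions: the names `exponential_mechanism_learner` and `make_threshold` are hypothetical, and the threshold class is chosen only as a toy example.

```python
import math
import random


def exponential_mechanism_learner(sample, hypotheses, epsilon, rng=None):
    """Return a hypothesis drawn with probability proportional to
    exp(-epsilon * errors / 2), where errors is its empirical error count.

    sample: list of (x, y) labeled examples, one per individual.
    hypotheses: finite list of callables mapping x to a label.
    epsilon: privacy parameter (assumed > 0).
    """
    rng = rng or random.Random()
    # Score = minus the number of misclassified examples. Replacing one
    # example changes any score by at most 1, so the sensitivity is 1.
    scores = [-sum(1 for x, y in sample if h(x) != y) for h in hypotheses]
    # Shift by the best score so the exponentials do not underflow together.
    best = max(scores)
    weights = [math.exp(epsilon * (s - best) / 2.0) for s in scores]
    return rng.choices(hypotheses, weights=weights, k=1)[0]


def make_threshold(t):
    """Toy hypothesis: label 1 if x >= t, else 0 (hypothetical helper)."""
    def h(x):
        return 1 if x >= t else 0
    h.threshold = t
    return h


if __name__ == "__main__":
    # 200 labeled examples over the domain {0, ..., 9}, true threshold 5.
    data = [(x, 1 if x >= 5 else 0) for x in range(10)] * 20
    candidates = [make_threshold(t) for t in range(11)]
    chosen = exponential_mechanism_learner(data, candidates, epsilon=1.0)
    print("chosen threshold:", chosen.threshold)
```

In this sketch every example must be labeled, and the number of examples needed for a good output grows with the logarithm of the number of hypotheses. The abstract's semi-supervised setting instead asks how few of the examples need labels when unlabeled examples are cheap.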

Updated: 2020-08-03