Differentially Private Simple Linear Regression
arXiv - CS - Cryptography and Security. Pub Date: 2020-07-10, DOI: arxiv-2007.05157
Daniel Alabi, Audra McMillan, Jayshree Sarathy, Adam Smith and Salil Vadhan

Economics and social science research often requires analyzing datasets of sensitive personal information at fine granularity, with models fit to small subsets of the data. Unfortunately, such fine-grained analysis can easily reveal sensitive individual information. We study algorithms for simple linear regression that satisfy differential privacy, a constraint that guarantees that an algorithm's output reveals little about any individual input data record, even to an attacker with arbitrary side information about the dataset. We consider the design of differentially private algorithms for simple linear regression on small datasets, with tens to hundreds of datapoints, which is a particularly challenging regime for differential privacy. Focusing on a particular application to small-area analysis in economics research, we study the performance of a spectrum of algorithms that we adapt to this setting. We identify key factors that affect their performance, showing through a range of experiments that algorithms based on robust estimators (in particular, the Theil-Sen estimator) perform well on the smallest datasets, but that other, more standard algorithms do better as the dataset size increases.
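
To make the abstract's central idea concrete, the following is a minimal Python sketch, not the authors' implementation. It shows the classical (non-private) Theil-Sen slope, then one plausible way to privatize it: compute slopes on a random matching of the points, so that changing one record changes at most one slope, and release their median with the exponential mechanism. The function names, the clipping bounds `lo` and `hi`, the handling of tied x-values, and the rank-based utility are all assumptions made for illustration. The privacy argument is only outlined in the comments, and floating-point side channels are ignored.

```python
import math
import random
from statistics import median


def theil_sen_slope(xs, ys):
    """Classical (non-private) Theil-Sen: median of all pairwise slopes."""
    n = len(xs)
    slopes = [(ys[j] - ys[i]) / (xs[j] - xs[i])
              for i in range(n) for j in range(i + 1, n)
              if xs[j] != xs[i]]
    return median(slopes)


def matched_slopes(xs, ys, rng):
    """Slopes over a random matching: each point is used in at most one pair.

    Assumption for this sketch: a pair with equal x-values gets slope
    +inf, -inf, or 0 (by the sign of dy), so the number of slopes is fixed
    and one changed record alters exactly one slope.
    """
    idx = list(range(len(xs)))
    rng.shuffle(idx)  # the matching does not depend on the data values
    slopes = []
    for a, b in zip(idx[0::2], idx[1::2]):
        dx, dy = xs[b] - xs[a], ys[b] - ys[a]
        if dx != 0:
            slopes.append(dy / dx)
        else:
            slopes.append(math.copysign(math.inf, dy) if dy != 0 else 0.0)
    return slopes


def dp_median_exp_mech(values, lo, hi, epsilon, rng):
    """Exponential-mechanism median on [lo, hi] (requires lo < hi).

    Utility of a candidate is minus its rank distance from the middle.
    Replacing one value shifts any rank by at most 1, so the utility has
    sensitivity 1 and weights exp(epsilon * utility / 2) are used.
    """
    z = sorted(min(max(v, lo), hi) for v in values)  # clip to [lo, hi]
    n = len(z)
    edges = [lo] + z + [hi]
    # Interval i spans edges[i]..edges[i+1]; exactly i values lie below it.
    utils = [-abs(i - n / 2) for i in range(n + 1)]
    lengths = [edges[i + 1] - edges[i] for i in range(n + 1)]
    top = max(utils)  # subtracted for numerical stability only
    weights = [length * math.exp(epsilon * (u - top) / 2)
               for length, u in zip(lengths, utils)]
    i = rng.choices(range(n + 1), weights=weights)[0]
    return rng.uniform(edges[i], edges[i + 1])


def dp_theil_sen_slope(xs, ys, lo, hi, epsilon, rng=None):
    """Hypothetical private slope estimate combining the two steps above."""
    rng = rng or random.Random()
    return dp_median_exp_mech(matched_slopes(xs, ys, rng), lo, hi, epsilon, rng)
```

Calling `dp_theil_sen_slope(xs, ys, lo=-10.0, hi=10.0, epsilon=1.0)` returns a randomized slope inside the clipping interval. In this sketch the matching yields only about n/2 slopes, and when many slopes coincide the median intervals have zero width, so the output can land far from the true slope. Such behavior on tens of datapoints is one reason the small-data regime is difficult.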

Updated: 2020-07-13