Differentially private cross-silo federated learning
arXiv - CS - Cryptography and Security Pub Date : 2020-07-10 , DOI: arxiv-2007.05553
Mikko A. Heikkilä, Antti Koskela, Kana Shimizu, Samuel Kaski, Antti Honkela

Strict privacy is of paramount importance in distributed machine learning. Federated learning, with the main idea of communicating only what is needed for learning, has recently been introduced as a general approach for distributed learning to enhance learning and improve security. However, federated learning by itself does not guarantee any privacy for data subjects. To quantify and control how much privacy is compromised in the worst case, we can use differential privacy. In this paper we combine additively homomorphic secure summation protocols with differential privacy in the so-called cross-silo federated learning setting. The goal is to learn complex models like neural networks while guaranteeing strict privacy for the individual data subjects. We demonstrate that our proposed solutions give prediction accuracy comparable to the non-distributed setting, and are fast enough to enable learning models with millions of parameters in a reasonable time. To enable learning under strict privacy guarantees that need privacy amplification by subsampling, we present a general algorithm for oblivious distributed subsampling. However, we also argue that when malicious parties are present, a simple approach using distributed Poisson subsampling gives better privacy. Finally, we show that by leveraging random projections we can further scale up our approach to larger models while suffering only a modest performance loss.
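The core idea of combining secure summation with differential privacy can be illustrated with a toy sketch: each silo adds a share of the Gaussian noise to its local value, masks the result with pairwise random masks that cancel in the total, and only the noisy sum is revealed. This is a minimal simulation for intuition only, not the paper's protocol; the fixed-point scale, the prime modulus, and the per-party noise split `sigma/sqrt(n)` are illustrative assumptions.

```python
import math
import random

PRIME = 2**61 - 1   # modulus for masking (assumed; any large prime works)
SCALE = 10**6       # fixed-point encoding scale (assumed)

def encode(x):
    # map a real number to a fixed-point residue mod PRIME
    return int(round(x * SCALE)) % PRIME

def decode(v):
    # map a residue back to a real number, handling negatives
    if v > PRIME // 2:
        v -= PRIME
    return v / SCALE

def secure_sum_with_dp(values, sigma, rng):
    """Simulate n parties computing a differentially private sum.

    Each party adds Gaussian noise with std sigma/sqrt(n), so the
    revealed total carries noise with std sigma, and masks its
    contribution with pairwise random values that cancel in the sum.
    """
    n = len(values)
    # pairwise additive masks: masks[i][j] + masks[j][i] == 0 (mod PRIME)
    masks = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            m = rng.randrange(PRIME)
            masks[i][j] = m
            masks[j][i] = -m % PRIME
    shares = []
    for i, x in enumerate(values):
        noise = rng.gauss(0.0, sigma / math.sqrt(n))  # this party's noise share
        v = encode(x + noise)
        for j in range(n):
            if j != i:
                v = (v + masks[i][j]) % PRIME
        shares.append(v)  # individually, each share looks uniformly random
    return decode(sum(shares) % PRIME)
```

With `sigma = 0` the masks cancel exactly and the true sum is recovered; with `sigma > 0` no single party's unnoised value is ever exposed, and the aggregate noise provides the differential privacy guarantee.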

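The distributed Poisson subsampling mentioned in the abstract requires no coordination between silos, which is why it remains robust when some parties are malicious: each party simply includes each local record independently with probability q. A minimal sketch (the function name and interface are illustrative assumptions, not the paper's API):

```python
import random

def poisson_subsample(local_data, q, rng):
    """Each record is included independently with probability q.

    Because every party samples locally and independently, no party
    needs to learn anything about the others' data or sample sizes
    to run the subsampling step.
    """
    return [x for x in local_data if rng.random() < q]
```

Running this at each silo before the secure summation step yields the subsampled batch whose randomness enables privacy amplification.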
Updated: 2020-07-14