Secure and Differentially Private Logistic Regression for Horizontally Distributed Data, IEEE Transactions on Information Forensics and Security


IEEE Transactions on Information Forensics and Security (IF 6.8), Pub Date: 2019-06-27, DOI: 10.1109/tifs.2019.2925496
Miran Kim, Junghye Lee, Lucila Ohno-Machado, Xiaoqian Jiang

Scientific collaborations benefit from sharing information and data from distributed sources, but protecting privacy is a major concern. Researchers, funders, and the public in general are getting increasingly worried about the potential leakage of private data. Advanced security methods have been developed to protect the storage and computation of sensitive data in a distributed setting. However, they do not protect against information leakage from the outcomes of data analyses. To address this aspect, studies on differential privacy (a state-of-the-art privacy protection framework) demonstrated encouraging results, but most of them do not apply to distributed scenarios. Combining security and privacy methodologies is a natural way to tackle the problem, but naive solutions may lead to poor analytical performance. In this paper, we introduce a novel strategy that combines differential privacy methods and homomorphic encryption techniques to achieve the best of both worlds. Using logistic regression (a popular model in biomedicine), we demonstrated the practicability of building secure and privacy-preserving models with high efficiency (less than 3 min) and good accuracy [<1% difference in the area under the receiver operating characteristic curve (AUC) against the global model] using a few real-world datasets.
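The abstract does not spell out the protocol, so the following is only a rough sketch of the general idea and not the authors' method. It assumes each site trains an L2-regularized logistic regression locally, the local weight vectors are averaged (a plain NumPy mean stands in here for aggregation under homomorphic encryption), and noise is added once to the average using output perturbation, a standard mechanism from the differential privacy literature. All function names, the regularization strength `lam`, and the synthetic data are hypothetical. The sensitivity bound used below assumes every feature vector has L2 norm at most 1 and that the local optimizer reaches the exact minimizer, which gradient descent only approximates.

```python
import numpy as np

def train_logreg(X, y, lam, steps=500, lr=0.5):
    """L2-regularized logistic regression via gradient descent; labels y are in {-1, +1}."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        margins = y * (X @ w)
        # Gradient of mean(log(1 + exp(-margin))) + (lam / 2) * ||w||^2.
        grad = -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0) + lam * w
        w -= lr * grad
    return w

def sample_noise(d, sensitivity, eps, rng):
    """Draw b in R^d with density proportional to exp(-eps * ||b|| / sensitivity)."""
    norm = rng.gamma(shape=d, scale=sensitivity / eps)  # radial part
    direction = rng.normal(size=d)                      # uniform direction on the sphere
    direction /= np.linalg.norm(direction)
    return norm * direction

def private_average(sites, lam, eps, rng):
    """Average local models, then perturb the average once (assumed design, see lead-in)."""
    local = [train_logreg(X, y, lam) for X, y in sites]
    avg = np.mean(local, axis=0)  # stand-in for aggregation under homomorphic encryption
    n_min = min(len(y) for _, y in sites)
    # One changed record moves one local minimizer by at most 2 / (n_k * lam),
    # so the average of K models moves by at most 2 / (K * n_min * lam).
    sensitivity = 2.0 / (len(sites) * n_min * lam)
    return avg + sample_noise(avg.size, sensitivity, eps, rng)

def make_site(n, rng):
    """Synthetic site data (hypothetical): rows clipped to L2 norm <= 1."""
    X = rng.normal(size=(n, 3))
    X /= np.maximum(1.0, np.linalg.norm(X, axis=1, keepdims=True))
    y = np.where(X @ np.array([1.0, -1.0, 0.5]) > 0, 1.0, -1.0)
    return X, y

rng = np.random.default_rng(0)
sites = [make_site(n, rng) for n in (200, 300, 250)]
w_private = private_average(sites, lam=0.1, eps=1.0, rng=rng)
```

Adding noise once to the aggregate, instead of at every site, is one way to avoid the accuracy loss that the abstract attributes to naive combinations; whether the paper does exactly this is not stated in the text above.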


Updated: 2020-04-22
