Privacy-preserving mechanism for mixed data clustering with local differential privacy
Concurrency and Computation: Practice and Experience (IF 1.5), Pub Date: 2021-07-16, DOI: 10.1002/cpe.6503
Liujie Yuan, Shaobo Zhang, Gengming Zhu, Karim Alinani

In big data mining, the K-prototypes algorithm has become a popular clustering method for mixed data owing to its simplicity and efficiency. However, the clustering process of the K-prototypes method poses a risk of user privacy leakage because user data usually contain sensitive information. To address this issue, common solutions introduce a trusted third-party model for privacy protection in clustering analysis, but a fully trusted entity is difficult to find in practice. In this article, we propose a local differential privacy K-prototypes (LDPK) mechanism, which requires no trusted third party to perform privacy preprocessing on user data. Our mechanism first perturbs user data with local differential privacy and then completes the clustering through interaction between the server and the users. Furthermore, we extend LDPK into a privacy protection enhancement mechanism (ELDPK), which perturbs each user's clustering information in every iteration to protect user privacy further. Theoretical analysis proves the privacy and feasibility of the proposed schemes, and experimental results show that they guarantee the quality of the clustering results while satisfying local differential privacy.
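The abstract does not spell out which local perturbation primitives LDPK uses or how the server-user interaction is organized. The following is only a minimal sketch of the general idea for mixed data, assuming k-ary randomized response for categorical attributes and a Duchi-style mechanism for numeric attributes scaled to [-1, 1]; the helper names (perturb_record, mixed_distance) and the even per-attribute budget split are hypothetical, not the paper's design.

```python
import math
import random


def perturb_categorical(value, domain, epsilon):
    """k-ary randomized response: report the true category with probability p,
    otherwise report a uniformly chosen different category."""
    k = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if random.random() < p:
        return value
    return random.choice([v for v in domain if v != value])


def perturb_numeric(t, epsilon):
    """Duchi-style mechanism for a numeric value t pre-scaled to [-1, 1];
    returns one of two extreme values so the report is epsilon-LDP and unbiased."""
    c = (math.exp(epsilon) + 1.0) / (math.exp(epsilon) - 1.0)
    p = 0.5 + (math.exp(epsilon) - 1.0) * t / (2.0 * (math.exp(epsilon) + 1.0))
    return c if random.random() < p else -c


def perturb_record(record, schema, epsilon):
    """Perturb one mixed-type record on the user side, splitting the budget
    evenly across attributes (a simple composition choice for illustration)."""
    eps_attr = epsilon / len(schema)
    out = []
    for value, spec in zip(record, schema):
        if spec == "numeric":
            out.append(perturb_numeric(value, eps_attr))
        else:  # spec is the categorical domain (a list of possible values)
            out.append(perturb_categorical(value, spec, eps_attr))
    return out


def mixed_distance(x, y, schema, gamma):
    """K-prototypes dissimilarity: squared Euclidean distance on numeric
    attributes plus gamma times the number of mismatched categorical ones."""
    num = sum((a - b) ** 2 for a, b, s in zip(x, y, schema) if s == "numeric")
    cat = sum(a != b for a, b, s in zip(x, y, schema) if s != "numeric")
    return num + gamma * cat


# Example: one user perturbs a record with a numeric attribute (scaled to [-1, 1])
# and a categorical attribute before sending it to the untrusted server.
schema = ["numeric", ["red", "green", "blue"]]
noisy = perturb_record([0.3, "red"], schema, epsilon=2.0)
print(noisy, mixed_distance(noisy, [0.0, "blue"], schema, gamma=1.0))
```

In such a sketch the server only ever sees the noisy reports, and the K-prototypes distance is computed on them; ELDPK, as described above, would additionally perturb the per-iteration cluster assignments before they are returned to the server.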

Updated: 2021-07-16