Privacy-preserving Constrained Spectral Clustering Algorithm for Large-scale Data Sets
IET Information Security (IF 1.3), Pub Date: 2020-05-01, DOI: 10.1049/iet-ifs.2019.0255
Ji Li 1,2, Jianghong Wei 1,2, Mao Ye 3, Wenfen Liu 4, Xuexian Hu 1

With growing concern about the preservation of personal privacy, privacy-preserving data mining has become a hot topic in recent years. Spectral clustering is one of the most widely used clustering algorithms for exploratory data analysis and often has to deal with sensitive data sets. How to conduct privacy-preserving spectral clustering is therefore an urgent problem. In this study, the authors focus on introducing differential privacy, which is considered the de facto standard for privacy-preserving data analysis, into spectral clustering. Specifically, by combining the well-studied constrained spectral clustering with the Wishart mechanism in a novel way, the authors propose a differentially private constrained spectral clustering (DP-CSC) algorithm. The DP-CSC algorithm is proved to capture an asymptotic property and to achieve ϵ-differential privacy. To illustrate the effectiveness and efficiency of DP-CSC, the authors conduct experiments on five real-world data sets. The results indicate that the DP-CSC algorithm can provide acceptable clustering accuracy with a short running time while preserving individual privacy.
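The abstract names the Wishart mechanism but gives no details of how DP-CSC applies it. As a rough illustration of the general idea only, the sketch below adds a Wishart-distributed noise matrix to a Gram matrix and then takes its leading eigenvectors. It is not the authors' algorithm: the function names, the choice of `dim + 1` degrees of freedom, and the `noise_scale` parameter are all assumptions, and the calibration of `noise_scale` to ϵ is not stated in the abstract and is left open here.

```python
import numpy as np


def wishart_noise(dim, dof, scale, rng):
    """Sample W ~ Wishart_dim(dof, scale * I) as scale * Z @ Z.T,
    where Z is a (dim, dof) matrix of standard normal entries."""
    z = rng.standard_normal((dim, dof))
    return scale * (z @ z.T)


def noisy_top_eigenvectors(points, k, noise_scale, rng):
    """Perturb the Gram matrix X^T X with Wishart noise and return the
    noisy matrix together with its k leading eigenvectors.

    noise_scale is a hypothetical parameter; in a differentially private
    setting it would be derived from the privacy budget (assumption)."""
    gram = points.T @ points                   # (d, d), symmetric PSD
    dim = gram.shape[0]
    # dim + 1 degrees of freedom is an assumed choice that keeps the
    # noise matrix positive definite with probability 1.
    noisy = gram + wishart_noise(dim, dim + 1, noise_scale, rng)
    eigvals, eigvecs = np.linalg.eigh(noisy)   # eigenvalues ascending
    return noisy, eigvecs[:, -k:]              # last k columns = top k


# Usage on synthetic data (values are illustrative only).
rng = np.random.default_rng(0)
points = rng.standard_normal((50, 4))
noisy, top_vecs = noisy_top_eigenvectors(points, k=2, noise_scale=0.5, rng=rng)
```

Because the Wishart noise is itself symmetric and positive semidefinite, the perturbed matrix stays a valid input for an eigendecomposition, which is one reason this style of noise is attractive for spectral methods.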

Updated: 2020-05-01