DP-UserPro: differentially private user profile construction and publication
Frontiers of Computer Science (IF 4.2), Pub Date: 2021-06-04, DOI: 10.1007/s11704-020-9462-9
Zheng Huo, Ping He, Lisha Hu, Huanyu Zhao

User profiles are widely used in the age of big data. However, generating and releasing user profiles may cause serious privacy leakage, since large amounts of personal data are collected and analyzed. In this paper, we propose DP-UserPro, a differentially private user profile construction method composed of two parts: DP-CLIQUE and private top-k tag selection. DP-CLIQUE is a differentially private clustering algorithm for high-dimensional data, based on CLIQUE. The multidimensional tag space is divided into cells, and Laplace noise is added to the count of each cell. Using breadth-first search, the largest connected sets of dense cells are grouped into clusters. A private top-k tag selection approach, based on a score function for each tag, then selects the k most important tags that represent the characteristics of each cluster. Finally, the privacy and utility of DP-UserPro are analyzed theoretically and evaluated experimentally. Comparison experiments against the Tag Suppression algorithm are carried out on two real datasets, measuring the False Negative Rate (FNR) and precision. The results show that DP-UserPro outperforms Tag Suppression on FNR by 62.5% in the best case and 14.25% in the worst case, and that its precision is about 21.1% better than that of Tag Suppression on average.
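
The clustering step described in the abstract (grid cells, Laplace-noised counts, breadth-first search over dense cells) can be illustrated with a minimal sketch. This is not the authors' DP-CLIQUE implementation: the function names, the `cell_width`, `epsilon` and `threshold` parameters, the sensitivity of 1, and the axis-adjacency rule for "connected" cells are all assumptions made for illustration. For brevity the sketch adds noise only to non-empty cells, whereas a complete differentially private release would also have to account for empty cells.

```python
import math
import random
from collections import Counter, deque


def noisy_cell_counts(points, cell_width, epsilon, rng):
    """Map each point to a grid cell and add Laplace noise to every cell count.

    Assumption: each user contributes one point, so a count has sensitivity 1
    and the noise scale is 1 / epsilon.
    """
    cells = Counter(
        tuple(math.floor(x / cell_width) for x in point) for point in points
    )
    noisy = {}
    for cell, count in cells.items():
        # The difference of two Exp(rate=epsilon) draws is Laplace(0, 1/epsilon).
        noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
        noisy[cell] = count + noise
    return noisy


def cluster_dense_cells(counts, threshold):
    """Group dense cells (noisy count >= threshold) into connected clusters.

    Two cells are treated as connected when they differ by 1 along exactly
    one axis (an assumed adjacency rule). Clusters are found with BFS.
    """
    dense = {cell for cell, n in counts.items() if n >= threshold}
    seen = set()
    clusters = []
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            cell = queue.popleft()
            component.append(cell)
            for axis in range(len(cell)):
                for step in (-1, 1):
                    neighbor = cell[:axis] + (cell[axis] + step,) + cell[axis + 1:]
                    if neighbor in dense and neighbor not in seen:
                        seen.add(neighbor)
                        queue.append(neighbor)
        clusters.append(sorted(component))
    return clusters
```

A typical call would pass the noisy counts straight into the clustering step, for example `cluster_dense_cells(noisy_cell_counts(points, 1.0, 0.5, random.Random(0)), threshold=3)`. Because clustering reads only the noised counts, it is post-processing and consumes no additional privacy budget under the stated assumptions.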



Updated: 2021-06-04