Cost-based Recommendation of Parameters for Local Differentially Private Data Aggregation
Computers & Security (IF 5.6), Pub Date: 2021-03-01, DOI: 10.1016/j.cose.2020.102144
Snehkumar Shahani, R Venkateswaran, Jibi Abraham

Abstract: The ability to analyze personal data for a group of individuals without compromising their privacy has been a focus of significant research in recent years. For such analyses, data analysts need to acquire data from individuals without revealing their Individually Identifiable Data (IID). Well-established differentially private techniques, characterized by privacy parameters (ϵ, δ), transform the data to protect the IID. However, such transformations reduce the usefulness of the data, leading to a trade-off between usefulness and privacy. Negotiating appropriate values of the privacy parameters before data acquisition is therefore a challenging task for data analysts. Most existing work on selecting privacy parameter values either constrains all other parameters or provides a set of acceptable values. Even then, selecting the best value from the set of acceptable values is left to the analyst. A major contribution of this paper is a method that addresses this issue by introducing a cost-based model to identify the best values of the privacy parameters in the trade-off between usefulness and privacy. To enable estimation of usefulness and its cost before data acquisition, we mathematically model utility in terms of data and privacy parameters. We consider standard statistical aggregates such as Sum, Mean and Standard Deviation, whereas most existing works consider only the Count query as aggregate analysis. The correctness of our mathematical estimation has been validated on a diverse set of synthetic and real-world datasets spanning popular data distributions.
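
The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the paper's cost model, which the abstract does not spell out. The sketch assumes a pure ϵ-local-DP Laplace mechanism (δ = 0) applied by each individual to a value clipped to a known range, and a hypothetical linear cost in which `error_cost` prices each unit of expected error in the Mean and `privacy_cost` prices each unit of ϵ. All function names and cost weights are illustrative assumptions.

```python
import math
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb(value: float, lo: float, hi: float, epsilon: float,
            rng: random.Random) -> float:
    """Locally perturb one individual's value before it leaves their device.

    Assumption: values are clipped to [lo, hi], so the sensitivity is hi - lo
    and Laplace noise with scale (hi - lo) / epsilon gives epsilon-local DP.
    """
    clipped = min(max(value, lo), hi)
    return clipped + laplace_noise((hi - lo) / epsilon, rng)


def mean_rmse(n: int, lo: float, hi: float, epsilon: float) -> float:
    """Expected noise-induced RMSE of the Mean over n perturbed reports.

    Laplace(b) has variance 2 * b**2, so the average of n independent
    noise draws has standard deviation b * sqrt(2 / n).
    """
    scale = (hi - lo) / epsilon
    return scale * math.sqrt(2.0 / n)


def recommend_epsilon(n: int, lo: float, hi: float, candidates: list[float],
                      error_cost: float, privacy_cost: float) -> float:
    """Pick the candidate epsilon with the lowest total cost.

    Hypothetical cost model (an assumption, not taken from the paper):
    total = error_cost * expected RMSE + privacy_cost * epsilon.
    """
    def total_cost(eps: float) -> float:
        return error_cost * mean_rmse(n, lo, hi, eps) + privacy_cost * eps

    return min(candidates, key=total_cost)


if __name__ == "__main__":
    rng = random.Random(7)
    ages = [rng.uniform(18, 90) for _ in range(10_000)]
    eps = recommend_epsilon(len(ages), 0.0, 100.0,
                            [0.1, 0.5, 1.0, 2.0, 4.0],
                            error_cost=1.0, privacy_cost=1.0)
    reports = [perturb(a, 0.0, 100.0, eps, rng) for a in ages]
    print("recommended epsilon:", eps)
    print("true mean:", sum(ages) / len(ages))
    print("noisy mean:", sum(reports) / len(reports))
```

Because the expected error can be computed from `n`, the value range and ϵ alone, the recommendation can be made before any data is acquired, which is the property the abstract emphasizes. A real deployment would replace the linear cost terms with whatever costs the analyst and the data owners actually negotiate.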

Updated: 2021-03-01