The Trade-off between Privacy and Fidelity via Ehrhart Theory, IEEE Transactions on Information Theory



The Trade-off between Privacy and Fidelity via Ehrhart Theory
IEEE Transactions on Information Theory (IF 2.2), Pub Date: 2020-04-01, DOI: 10.1109/tit.2019.2959976
Arun Padakandla, P. R. Kumar, Wojciech Szpankowski

As an increasing amount of data is gathered and stored in databases, the question arises of how to protect the privacy of individual records in a database while still providing accurate answers to queries on it. Differential Privacy (DP) has gained acceptance as a framework to quantify the vulnerability of algorithms to privacy breaches. We consider the problem of how to sanitize an entire database via a DP mechanism, after which unlimited further querying is performed on the sanitized database. While protecting privacy, it is important that the sanitized database still provide accurate responses to queries. The central contribution of this work is to characterize the amount of information preserved by an optimal DP database sanitizing mechanism (DSM). We precisely characterize the utility-privacy trade-off of mechanisms that sanitize databases in the asymptotic regime of large databases. We study this in an information-theoretic framework by modeling a generic distribution on the data and a measure of fidelity between the histograms of the original and sanitized databases. We consider the popular $\mathbb{L}_{1}$-distortion metric, i.e., the total variation norm, which leads to a formulation as a linear program (LP). This optimization problem is prohibitive in complexity, with the number of constraints growing exponentially in the parameters of the problem. Our focus on the asymptotic regime enables us to characterize precisely the limit of the sequence of solutions to this optimization problem. Leveraging tools from discrete geometry, analytic combinatorics, and duality theorems of optimization, we fully characterize this limit in terms of a power series whose coefficients are the number of integer points on a multidimensional convex cross-polytope studied by Ehrhart in 1967. Employing Ehrhart theory, we determine a simple, closed-form, computable expression for the asymptotic growth of the optimal privacy-fidelity trade-off to infinite precision.
At the heart of the findings is a deep connection between the minimum expected distortion and a fundamental construct in Ehrhart theory: the Ehrhart series of an integral convex polytope.
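
To make the lattice-point counts mentioned in the abstract concrete, the following minimal Python sketch counts the integer points in the dilated standard cross-polytope $\{x \in \mathbb{Z}^d : |x_1| + \dots + |x_d| \le t\}$ in three ways: by brute-force enumeration, through the standard Ehrhart polynomial $\sum_k \binom{d}{k}\binom{t}{k}2^k$, and as coefficients of the standard Ehrhart series $(1+z)^d/(1-z)^{d+1}$. This illustrates only the textbook Ehrhart theory of the cross-polytope. The abstract does not state the exact power series, the dimension, or whether interior or boundary points are counted in the paper, so those choices and all function names here are assumptions made for illustration.

```python
from itertools import product
from math import comb


def lattice_points_brute_force(d: int, t: int) -> int:
    """Count integer points x in Z^d with |x_1| + ... + |x_d| <= t by enumeration."""
    return sum(
        1
        for x in product(range(-t, t + 1), repeat=d)
        if sum(abs(c) for c in x) <= t
    )


def lattice_points_closed_form(d: int, t: int) -> int:
    """Ehrhart polynomial of the standard d-dimensional cross-polytope at dilation t."""
    # Choose k nonzero coordinates, a sign for each, and positive magnitudes
    # summing to at most t; math.comb(t, k) is 0 when k > t.
    return sum(comb(d, k) * comb(t, k) * 2**k for k in range(d + 1))


def ehrhart_series_coefficients(d: int, n_terms: int) -> list:
    """First n_terms coefficients of the power series (1 + z)^d / (1 - z)^(d + 1)."""
    # 1 / (1 - z)^(d + 1) has coefficients C(j + d, d); convolve them with the
    # binomial coefficients of the numerator (1 + z)^d.
    return [
        sum(comb(d, i) * comb(t - i + d, d) for i in range(min(d, t) + 1))
        for t in range(n_terms)
    ]


if __name__ == "__main__":
    d = 3  # illustrative dimension, not taken from the paper
    for t in range(5):
        print(t, lattice_points_brute_force(d, t), lattice_points_closed_form(d, t))
    print(ehrhart_series_coefficients(d, 5))
```

The three computations agree for small `d` and `t`, which is the sense in which the point counts are the coefficients of a rational power series. Brute-force enumeration grows as $(2t+1)^d$ and is included only as a check on the closed form.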


Last updated: 2020-04-01
