Overlook: Differentially Private Exploratory Visualization for Big Data
arXiv - CS - Databases. Pub Date: 2020-06-22, DOI: arxiv-2006.12018
Pratiksha Thaker, Mihai Budiu, Parikshit Gopalan, Udi Wieder, Matei Zaharia

Data exploration systems that provide differential privacy must manage a privacy budget that measures the amount of privacy lost across multiple queries. One effective strategy to manage the privacy budget is to compute a one-time private synopsis of the data, to which users can make an unlimited number of queries. However, existing systems using synopses are built for offline use cases, where a set of queries is known ahead of time and the system carefully optimizes a synopsis for it. The synopses that these systems build are costly to compute and may also be costly to store. We introduce Overlook, a system that enables private data exploration at interactive latencies for both data analysts and data curators. The key idea in Overlook is a virtual synopsis that can be evaluated incrementally, without extra storage space or expensive precomputation. Overlook simply executes queries using an existing engine, such as a SQL DBMS, and adds noise to their results. Because Overlook's synopses do not require costly precomputation or storage, data curators can also use Overlook to explore the impact of privacy parameters interactively. Overlook offers a rich visual query interface based on the open source Hillview system. Overlook achieves accuracy comparable to existing synopsis-based systems, while offering better performance and removing the need for extra storage.
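
The idea of a synopsis that is never materialized can be illustrated with a small sketch. It is not taken from the paper: the abstract does not describe Overlook's actual mechanism, so everything below is an assumption made for illustration. The sketch runs an ordinary count query against a SQL engine (sqlite3 here) and adds Laplace noise that is derived deterministically from a secret key and the bucket identity, so asking about the same bucket twice returns the same noisy value and nothing needs to be stored. The table name `events`, the column `bucket`, the function names, and the choice of scale 1/epsilon (which assumes disjoint buckets and that adding or removing one row changes one count by at most 1) are all hypothetical.

```python
import hashlib
import hmac
import math
import sqlite3


def laplace_noise(key: bytes, label: str, scale: float) -> float:
    """Deterministic Laplace(0, scale) sample derived from (key, label).

    Assumption for illustration only: the same (key, label) pair always
    yields the same noise, so the noisy value never has to be stored.
    """
    digest = hmac.new(key, label.encode("utf-8"), hashlib.sha256).digest()
    # Use 52 bits so that (n + 0.5) / 2**52 is exact and lies strictly in (0, 1).
    n = int.from_bytes(digest[:8], "big") >> 12
    u = (n + 0.5) / 2**52 - 0.5  # uniform in (-0.5, 0.5)
    # Inverse-CDF transform from a uniform value to a Laplace value.
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))


def noisy_histogram(conn, key: bytes, epsilon: float, buckets):
    """Count rows per bucket with the SQL engine, then add keyed noise.

    Every bucket in `buckets` gets a value, including empty ones, so the
    set of reported buckets does not depend on the data.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # Hypothetical schema: a table `events` with an integer column `bucket`.
    rows = conn.execute(
        "SELECT bucket, COUNT(*) FROM events GROUP BY bucket"
    ).fetchall()
    true_counts = dict(rows)
    scale = 1.0 / epsilon  # assumes each row affects exactly one count by 1
    return {
        b: true_counts.get(b, 0) + laplace_noise(key, f"events/bucket={b}", scale)
        for b in buckets
    }
```

Calling `noisy_histogram` twice with the same key, epsilon, and bucket list returns identical values, which is the property that lets repeated queries be answered without consuming additional budget in this simplified setting. The sketch covers only one fixed set of disjoint buckets; overlapping ranges or varying bucket widths would need a more careful construction, and floating-point Laplace sampling has known weaknesses that a production system would have to address.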

Updated: 2020-06-23