Pseudonymization risk analysis in distributed systems, Journal of Internet Services and Applications

Pseudonymization risk analysis in distributed systems
Journal of Internet Services and Applications (IF 2.4), Pub Date: 2019-01-08, DOI: 10.1186/s13174-018-0098-z
Geoffrey K. Neumann, Paul Grace, Daniel Burns, Mike Surridge

In an era of big data, online services are becoming increasingly data-centric; they collect, process, analyze and anonymously disclose growing amounts of personal data in the form of pseudonymized data sets. It is crucial that such systems are engineered both to protect individual user (data subject) privacy and to give back control of personal data to the user. In terms of pseudonymized data, this means that unwanted individuals should not be able to deduce sensitive information about the user. However, the plethora of pseudonymization algorithms and tuneable parameters that currently exist makes it difficult for a non-expert developer (data controller) to understand and realize strong privacy guarantees. In this paper we propose a principled Model-Driven Engineering (MDE) framework to model data services in terms of their pseudonymization strategies and identify the risks of breaches of user privacy. A developer can explore alternative pseudonymization strategies to determine the effectiveness of a given strategy in terms of quantifiable metrics: i) violations of privacy requirements for every user in the current data set; ii) the trade-off between conforming to these requirements and the usefulness of the data for its intended purposes. We demonstrate through an experimental evaluation that the information provided by the framework is useful, particularly in complex situations where privacy requirements differ between users, and that it can inform decisions to optimize a chosen strategy in comparison to applying an off-the-shelf algorithm.

Updated: 2019-01-08
