Analyzing privacy policies through syntax-driven semantic analysis of information types
Information and Software Technology ( IF 3.9 ) Pub Date : 2021-05-03 , DOI: 10.1016/j.infsof.2021.106608
Mitra Bokaei Hosseini , Travis D. Breaux , Rocky Slavin , Jianwei Niu , Xiaoyin Wang

Context:

Several government laws and app markets, such as Google Play, require the disclosure of app data practices to users. These data practices constitute critical privacy requirements statements, since they underpin the app’s functionality while describing how various personal information types are collected, used, and with whom they are shared.

Objective:

Abstract and ambiguous terminology in requirements statements concerning information types (e.g., “we collect your device information”) can reduce shared understanding among app developers, policy writers, and users.

Method:

To address this challenge, we propose a syntax-driven method that first parses a given information type phrase (e.g., “mobile device identifier”) into its constituents using a context-free grammar and then infers semantic relationships between the constituents using semantic rules. The inferred semantic relationships between a given phrase and its constituents generate a hierarchy that models the generality and ambiguity of phrases. Through this method, we infer relations from a lexicon consisting of a set of information type phrases to populate a partial ontology. The resulting ontology is a knowledge graph that can be used to guide requirements authors in selecting the most appropriate information type terms.
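The parse-then-infer idea can be illustrated with a minimal sketch. It is not the authors' grammar or rule set: it assumes a toy grammar `Phrase -> Modifier Phrase | Head`, in which the last word is the head and every preceding word is a modifier, and a single hypothetical rule that a phrase is a kind of (`is_a`) the phrase left after dropping its leftmost modifier. The names `parse`, `infer_relations`, and `is_a` are illustrative only.

```python
from typing import List, Tuple

# (narrower phrase, relation name, broader phrase); the shape is an assumption.
Relation = Tuple[str, str, str]


def parse(phrase: str) -> Tuple[List[str], str]:
    """Split a phrase into (modifiers, head) under the toy grammar
    Phrase -> Modifier Phrase | Head, where the last word is the head."""
    tokens = phrase.lower().split()
    if not tokens:
        raise ValueError("empty phrase")
    return tokens[:-1], tokens[-1]


def infer_relations(phrase: str) -> List[Relation]:
    """Apply one hypothetical semantic rule: dropping the leftmost
    modifier yields a broader phrase that the original is a kind of."""
    modifiers, head = parse(phrase)
    current = modifiers + [head]
    relations: List[Relation] = []
    while len(current) > 1:
        broader = current[1:]
        relations.append((" ".join(current), "is_a", " ".join(broader)))
        current = broader
    return relations


if __name__ == "__main__":
    for relation in infer_relations("mobile device identifier"):
        print(relation)
```

Running the rule over every phrase in a lexicon and collecting the resulting triples gives a small hierarchy of the kind the abstract describes; a fuller treatment would need additional rules for other relation types.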

Results:

We evaluate the method’s performance using two criteria: (1) expert assessment of relations between information types; and (2) non-expert preferences for relations between information types. The results suggest a performance improvement over a previously proposed method. We also evaluate the reliability of the method using information types extracted from different data practices (e.g., collection, usage, and sharing) in privacy policies for mobile or web-based apps in various app domains.

Contributions:

The method achieves an average of 89% precision and 87% recall on information types drawn from various app domains and data practices. Based on these results, we conclude that the method generalizes reliably in inferring relations and in reducing the ambiguity and abstraction in privacy policies.
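For readers unfamiliar with the two metrics, the sketch below shows how precision and recall are conventionally computed when a set of inferred relations is compared against a reference set. The abstract does not give the authors' exact evaluation procedure, so the function name `precision_recall`, the triple format, and the sample relations are all hypothetical.

```python
from typing import Set, Tuple

# (narrower phrase, relation name, broader phrase); the shape is an assumption.
Relation = Tuple[str, str, str]


def precision_recall(inferred: Set[Relation],
                     reference: Set[Relation]) -> Tuple[float, float]:
    """Precision: share of inferred relations found in the reference set.
    Recall: share of reference relations that were inferred."""
    true_positives = len(inferred & reference)
    precision = true_positives / len(inferred) if inferred else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the paper.
    inferred = {
        ("device identifier", "is_a", "identifier"),
        ("mobile device identifier", "is_a", "device identifier"),
        ("device information", "is_a", "information"),
        ("device information", "is_a", "device"),
    }
    reference = {
        ("device identifier", "is_a", "identifier"),
        ("mobile device identifier", "is_a", "device identifier"),
        ("device information", "is_a", "information"),
        ("ip address", "is_a", "address"),
        ("ip address", "is_a", "device identifier"),
    }
    print(precision_recall(inferred, reference))
```

An average over app domains, as reported in the abstract, would then be the mean of the per-domain values.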




Updated: 2021-05-12