Analyzing Privacy Policies at Scale, ACM Transactions on the Web


Analyzing Privacy Policies at Scale
ACM Transactions on the Web (IF 2.6), Pub Date: 2018-12-04, DOI: 10.1145/3230665
Shomir Wilson₁, Florian Schaub₁, Frederick Liu₁, Kanthashree Mysore Sathyendra₁, Daniel Smullen₁, Sebastian Zimmeck₁, Rohan Ramanath₁, Peter Story₁, Fei Liu₁, Norman Sadeh₁, Noah A. Smith₂

Website privacy policies are often long and difficult to understand. While research shows that Internet users care about their privacy, they do not have the time to understand the policies of every website they visit, and most users hardly ever read privacy policies. Some recent efforts have aimed to use a combination of crowdsourcing, machine learning, and natural language processing to interpret privacy policies at scale, thus producing annotations for use in interfaces that inform Internet users of salient policy details. However, little attention has been devoted to studying the accuracy of crowdsourced privacy policy annotations, how crowdworker productivity can be enhanced for such a task, and the levels of granularity that are feasible for automatic analysis of privacy policies. In this article, we present a trajectory of work addressing each of these topics. We include analyses of crowdworker performance, evaluation of a method to make a privacy-policy oriented task easier for crowdworkers, a coarse-grained approach to labeling segments of policy text with descriptive themes, and a fine-grained approach to identifying user choices described in policy text. Together, the results from these efforts show the effectiveness of using automated and semi-automated methods for extracting from privacy policies the data practice details that are salient to Internet users’ interests.

Updated: 2018-12-04
