Identifying Disinformation Websites Using Infrastructure Features, arXiv - CS - Computers and Society


arXiv - CS - Computers and Society. Pub Date: 2020-02-28, DOI: arxiv-2003.07684
Austin Hounsel, Jordan Holland, Ben Kaiser, Kevin Borgolte, Nick Feamster, Jonathan Mayer

Platforms have struggled to keep pace with the spread of disinformation. Current responses like user reports, manual analysis, and third-party fact checking are slow and difficult to scale, and as a result, disinformation can spread unchecked for some time after being created. Automation is essential for enabling platforms to respond rapidly to disinformation. In this work, we explore a new direction for automated detection of disinformation websites: infrastructure features. Our hypothesis is that while disinformation websites may be perceptually similar to authentic news websites, there may also be significant non-perceptual differences in the domain registrations, TLS/SSL certificates, and web hosting configurations. Infrastructure features are particularly valuable for detecting disinformation websites because they are available before content goes live and reaches readers, enabling early detection. We demonstrate the feasibility of our approach on a large corpus of labeled website snapshots. We also present results from a preliminary real-time deployment, successfully discovering disinformation websites while highlighting unexplored challenges for automated disinformation detection.
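As a rough illustration of what "infrastructure features" could look like in practice, the sketch below turns domain registration and TLS certificate metadata into a small numeric feature dictionary. The abstract does not list the features the authors actually used, so the `SiteRecord` fields, the `infrastructure_features` function, and the specific features are all assumptions chosen for illustration, not the paper's method.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SiteRecord:
    """Hypothetical infrastructure metadata for one website snapshot."""
    domain: str
    registered_on: date        # domain registration date (e.g. from WHOIS)
    cert_not_before: date      # start of TLS certificate validity
    cert_not_after: date       # end of TLS certificate validity
    cert_san_count: int        # number of names listed on the certificate
    registrant_private: bool   # True if registration uses a privacy proxy


def infrastructure_features(rec: SiteRecord, observed_on: date) -> dict:
    """Derive numeric features from infrastructure metadata only.

    No page content is inspected, so these values could be computed
    before a site publishes anything. The feature choice is illustrative.
    """
    tld = rec.domain.rsplit(".", 1)[-1].lower()
    return {
        "domain_age_days": (observed_on - rec.registered_on).days,
        "cert_lifetime_days": (rec.cert_not_after - rec.cert_not_before).days,
        "cert_san_count": rec.cert_san_count,
        "registrant_private": int(rec.registrant_private),
        "tld_is_com": int(tld == "com"),
    }


# Usage with made-up values for a hypothetical domain.
record = SiteRecord(
    domain="example.news",
    registered_on=date(2020, 1, 15),
    cert_not_before=date(2020, 1, 20),
    cert_not_after=date(2020, 4, 19),
    cert_san_count=1,
    registrant_private=True,
)
features = infrastructure_features(record, observed_on=date(2020, 2, 28))
print(features)
```

A dictionary like this could then be fed to any standard classifier trained on labeled website snapshots; which model and which features work best is a question the full paper addresses, not this sketch.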


Updated: 2020-09-30
