Disinformation: analysis and identification
Computational and Mathematical Organization Theory (IF 1.8), Pub Date: 2021-06-18, DOI: 10.1007/s10588-021-09336-x
Archita Pathak 1, 2 , Rohini K Srihari 1 , Nihit Natu 1, 3

We present an extensive study on disinformation, which is defined as information that is false and misleading and intentionally shared to cause harm. Through this work, we aim to answer the following questions:

  • Can we automatically and accurately classify a news article as containing disinformation?

  • What characteristics of disinformation differentiate it from other types of benign information?

We conduct this study in the context of two significant events: the US elections of 2016 and the 2020 COVID pandemic. We build a series of classifiers to (i) examine linguistic clues exhibited by different types of fake news articles, (ii) analyze “clickbaityness” of disinformation headlines, and (iii) finally, perform fine-grained, veracity-based article classification through a natural language inference (NLI) module for automated disinformation verification; this utilizes a manually curated set of evidence sources. For the latter, we built a new dataset that is annotated with generic, veracity-based labels and ground truth evidence supporting each label. The veracity labels were formulated based on examining standards used by reputable fact-checking organizations. We show that disinformation derives features from both propaganda and mainstream news, making it more challenging to detect. However, there is significant potential for automating the fact-checking process to incorporate the degree of veracity. We provide error analysis that illustrates the challenges involved in the automated fact-checking task and identifies factors that may improve this process in future work. Finally, we also describe the implementation of a web app that extracts important entities and actions from a given article and searches the web to gather evidence from credible sources. The evidence articles are then used to generate a veracity label that can assist manual fact-checkers engaged in combating disinformation.
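The final step described above, turning a set of evidence articles into a single veracity label, can be illustrated with a minimal sketch. It assumes each evidence article has already been scored against the claim by some NLI model as `entailment`, `contradiction`, or `neutral`. The function name `aggregate_veracity`, the majority-vote rule, and the output labels (`supported`, `refuted`, `mixed`, `unverified`) are hypothetical and chosen only for illustration; the abstract does not state the authors' actual labels or aggregation method.

```python
from collections import Counter


def aggregate_veracity(nli_labels):
    """Collapse per-evidence NLI outcomes into one coarse veracity label.

    All label names here are hypothetical placeholders, not the labels
    used in the paper.
    """
    counts = Counter(nli_labels)
    support = counts["entailment"]      # evidence that agrees with the claim
    refute = counts["contradiction"]    # evidence that disagrees with the claim

    # Only neutral evidence (or none at all): nothing to decide on.
    if support == 0 and refute == 0:
        return "unverified"
    if support > refute:
        return "supported"
    if refute > support:
        return "refuted"
    # Equal amounts of supporting and refuting evidence.
    return "mixed"


if __name__ == "__main__":
    print(aggregate_veracity(["entailment", "entailment", "contradiction"]))
```

A simple vote like this treats every evidence source as equally credible; a real system would likely weight sources, which is one reason the sketch should be read as a starting point rather than a description of the published method.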




Updated: 2021-06-18