Computational and Mathematical Organization Theory (IF 1.8), Pub Date: 2021-04-26, DOI: 10.1007/s10588-021-09329-w
Lynnette Hui Xian Ng, Kathleen M Carley
The 2020 coronavirus pandemic has heightened the need to flag coronavirus-related misinformation, and fact-checking groups have taken to verifying misinformation on the Internet. We explore stories reported by the fact-checking groups PolitiFact, Poynter and Snopes from January to June 2020. We group these stories into six clusters, then analyse temporal trends in story validity and the level of agreement across sites. The sites present the same stories 78% of the time, with the highest agreement between Poynter and PolitiFact. We further break down the story clusters into more granular story types by proposing a unique automated method, which can be used to classify diverse story sources in both fact-checked stories and tweets. Our results show that story type classification performs best when trained on the same medium, with contextualised BERT vector representations outperforming a Bag-of-Words classifier.
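To make the Bag-of-Words baseline mentioned above concrete, here is a minimal sketch of a bag-of-words story-type classifier. It is not the authors' implementation: the nearest-centroid rule, the function names, the story-type labels (`conspiracy`, `false_cure`) and the training sentences are all assumptions invented for illustration, and the abstract does not specify which classifier or features were actually used.

```python
from collections import Counter
import math
import re


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(examples):
    """Sum the bag-of-words vectors of each story type into one centroid."""
    centroids = {}
    for text, label in examples:
        centroids.setdefault(label, Counter()).update(bag_of_words(text))
    return centroids


def classify(text, centroids):
    """Return the story type whose centroid is most similar to the text."""
    vec = bag_of_words(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical toy data: neither the labels nor the sentences come from the paper.
TRAIN = [
    ("coronavirus was engineered in a lab as a bioweapon", "conspiracy"),
    ("the virus is a bioweapon released on purpose", "conspiracy"),
    ("drinking hot water with garlic cures coronavirus", "false_cure"),
    ("vitamin c and garlic cure the virus", "false_cure"),
]

centroids = train_centroids(TRAIN)
print(classify("claim says the virus is a lab made bioweapon", centroids))
```

A bag-of-words vector ignores word order and context, which is one plausible reason a contextualised representation such as BERT could do better on short, varied texts like tweets. The sketch only shows the baseline side of that comparison.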
Title: "Coronavirus is a bioweapon": classifying coronavirus stories on fact-checking sites