Eating Garlic Prevents COVID-19 Infection: Detecting Misinformation on the Arabic Content of Twitter
arXiv - CS - Information Retrieval. Pub Date: 2021-01-09, DOI: arxiv-2101.05626
Sarah Alqurashi, Btool Hamoui, Abdulaziz Alashaikh, Ahmad Alhindi, Eisa Alanazi

The rapid growth of social media content during the current pandemic has made social media a useful tool for disseminating information, but it has also made it a source of misinformation. There is therefore an urgent need for fact-checking and for effective techniques to detect misinformation on social media. In this work, we study misinformation in the Arabic content of Twitter. We construct a large Arabic dataset related to COVID-19 misinformation and annotate the tweets with gold-standard labels in two categories: misinformation or not. We then apply eight traditional and deep machine learning models with different features, including word embeddings and word frequency. The word embedding models (FastText and word2vec) are trained on more than two million Arabic tweets related to COVID-19. Experiments show that optimizing the area under the curve (AUC) improves the models' performance and that Extreme Gradient Boosting (XGBoost) achieves the highest accuracy in detecting COVID-19 misinformation online.
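
The abstract names two ingredients that can be illustrated compactly: word-frequency features and the AUC metric. The following is a minimal sketch in plain Python, not the authors' pipeline. The function names `term_frequencies` and `roc_auc`, the whitespace tokenizer, and the toy labels and scores are all assumptions made for illustration; the paper's actual preprocessing, models, and data are not described in this listing.

```python
from collections import Counter


def term_frequencies(text):
    """Return relative word frequencies for a whitespace-tokenized text.

    Assumption: simple lowercase + whitespace splitting; real Arabic
    preprocessing would need normalization that is not shown here.
    """
    tokens = text.lower().split()
    if not tokens:
        return {}
    counts = Counter(tokens)
    total = len(tokens)
    return {word: n / total for word, n in counts.items()}


def roc_auc(labels, scores):
    """Compute AUC as the probability that a randomly chosen positive
    example receives a higher score than a randomly chosen negative one.
    Ties between a positive and a negative score count as 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = misinformation, 0 = not misinformation.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.2, 0.4, 0.6]  # made-up classifier scores
print(roc_auc(toy_labels, toy_scores))  # 0.75
print(term_frequencies("garlic cures garlic"))
```

The pairwise form of AUC used here is quadratic in the number of examples, which is fine for a sketch; a rank-based computation would be the usual choice for a dataset of realistic size.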

Updated: 2021-01-15