

Evaluating transfer learning approach for detecting Arabic anti-refugee/migrant speech on social media
Aslib Journal of Information Management (IF 2.6), Pub Date: 2022-03-22, DOI: 10.1108/ajim-10-2021-0293
Djamila Mohdeb, Meriem Laifa, Fayssal Zerargui, Omar Benzaoui


Purpose

The present study investigates eight research questions on the analysis and detection of dialectal Arabic hate speech targeting African refugees and illegal migrants in the Algerian YouTube space.

Design/methodology/approach

The transfer learning approach, which currently represents the state of the art in natural language processing tasks, was used to classify and detect hate speech in Algerian dialectal Arabic. In addition, a descriptive analysis was conducted to answer the analytical research questions, which aim to measure and evaluate the presence of anti-refugee/migrant discourse on the YouTube platform.
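As a rough illustration of the descriptive analysis mentioned above, the following minimal sketch counts hateful comments per year and computes their share of all comments in that year. The records, the label names and the function `hateful_share_by_year` are hypothetical placeholders for illustration only; they are not the study's data or the authors' code.

```python
from collections import Counter
from datetime import date

# Hypothetical toy records: (publication date, label).
# These are NOT figures from the study; they only exercise the function.
comments = [
    (date(2016, 5, 1), "hate"),
    (date(2017, 3, 9), "hate"),
    (date(2017, 8, 21), "not_hate"),
    (date(2017, 11, 2), "hate"),
    (date(2019, 1, 15), "not_hate"),
]


def hateful_share_by_year(records, positive_label="hate"):
    """Return {year: (hateful_count, total_count, hateful_share)}, sorted by year."""
    totals = Counter()
    hateful = Counter()
    for published, label in records:
        totals[published.year] += 1
        if label == positive_label:
            hateful[published.year] += 1
    # Counter returns 0 for years with no hateful comments.
    return {
        year: (hateful[year], totals[year], hateful[year] / totals[year])
        for year in sorted(totals)
    }


if __name__ == "__main__":
    for year, (n_hate, n_all, share) in hateful_share_by_year(comments).items():
        print(year, n_hate, n_all, round(share, 2))
```

Reporting the share alongside the raw count is one way to separate a genuine rise in hateful comments from a rise in overall comment volume; whether the study normalised its counts in this way is not stated in the abstract.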

Findings

Data analysis revealed a gradual, modest increase in the number of anti-refugee/migrant hateful comments on YouTube from 2014, a sharp rise in 2017 and a sharp decline in the following years through 2021. Furthermore, classifying hate content with multilingual and monolingual pre-trained language transformers showed that the monolingual AraBERT transformer performed well in comparison with the monodialectal transformer DziriBERT and the cross-lingual transformers mBERT and XLM-R.

Originality/value

Automatic hate speech detection in languages other than English is a challenging task that the literature has tried to address with various machine learning approaches. Although the recent approach of cross-lingual transfer learning offers a promising solution, tackling this problem for the Arabic language, and particularly dialectal Arabic, is even more challenging. Our results cast new light on the actual ability of the transfer learning approach to deal with low-resource languages that differ widely from high-resource languages as well as from other Latin-based, low-resource languages.


Updated: 2022-03-22
