A Deep Learning Framework for Automatic Detection of Hate Speech Embedded in Arabic Tweets

Duwairi, Rehab; Hayajneh, Amena; Quwaider, Muhannad

doi:10.1007/s13369-021-05383-3

Research Article-Computer Engineering and Computer Science
Published: 05 February 2021

Volume 46, pages 4001–4014, (2021)
Arabian Journal for Science and Engineering

Abstract

In this paper, we investigate the ability of three deep learning networks (CNN, CNN-LSTM, and BiLSTM-CNN) to automatically detect and classify hateful content posted on social media. The networks were trained and tested on the ArHS dataset, which consists of 9833 tweets annotated for hate speech detection in Arabic. To the best of our knowledge, this is the largest Arabic dataset that covers the subclasses of hate speech. We also evaluate the networks on a combined dataset of 23,678 tweets, formed by merging two existing Arabic hate speech datasets with ArHS. Three types of experiment are reported: binary classification of tweets into Hate or Normal; ternary classification into Hate, Abusive, or Normal; and multi-class classification into Misogyny, Racism, Religious Discrimination, Abusive, or Normal. On the ArHS dataset, the CNN model outperformed the other models in the binary task with an accuracy of 81%; the CNN and BiLSTM-CNN models shared the best accuracy of 74% in the ternary task; and the CNN-LSTM and BiLSTM-CNN models shared the best accuracy of 73% in the multi-class task. On the combined dataset, BiLSTM-CNN achieved an accuracy of 73% in the binary task and the best accuracy of 67% in the ternary task, while the CNN-LSTM and BiLSTM-CNN models shared the best accuracy of 65% in the multi-class task.
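
The three label granularities described in the abstract can be related to one another by collapsing the fine-grained classes. The following is a minimal sketch of one possible mapping, not the authors' preprocessing code. The function names, the `abusive_as` parameter, and the example labels are hypothetical. In particular, the abstract does not state how Abusive tweets are treated in the binary task, so that choice is left as an explicit parameter.

```python
from collections import Counter

# Fine-grained labels named in the abstract (multi-class task).
HATE_SUBCLASSES = {"Misogyny", "Racism", "Religious Discrimination"}
MULTI_CLASS_LABELS = HATE_SUBCLASSES | {"Abusive", "Normal"}


def to_ternary(label: str) -> str:
    """Collapse a fine-grained label into Hate / Abusive / Normal."""
    if label not in MULTI_CLASS_LABELS:
        raise ValueError(f"unknown label: {label!r}")
    return "Hate" if label in HATE_SUBCLASSES else label


def to_binary(label: str, abusive_as: str = "Hate") -> str:
    """Collapse a fine-grained label into Hate / Normal.

    ASSUMPTION: the abstract does not say where Abusive tweets go in the
    binary task, so the target class is a parameter, not a fixed rule.
    """
    if abusive_as not in {"Hate", "Normal"}:
        raise ValueError(f"abusive_as must be 'Hate' or 'Normal', got {abusive_as!r}")
    ternary = to_ternary(label)
    return abusive_as if ternary == "Abusive" else ternary


# Hypothetical labels for illustration only; not taken from ArHS.
example_labels = ["Racism", "Normal", "Abusive", "Misogyny", "Normal"]
ternary_counts = Counter(to_ternary(lab) for lab in example_labels)
binary_counts = Counter(to_binary(lab) for lab in example_labels)
```

Deriving the coarser label sets from the fine-grained one keeps a single annotated source of truth, so the same tweets can be reused across all three experiments.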

Author information

Authors and Affiliations

Jordan University of Science and Technology, Irbid, 22110, Jordan
Rehab Duwairi, Amena Hayajneh & Muhannad Quwaider

Corresponding author

Correspondence to Rehab Duwairi.

About this article

Cite this article

Duwairi, R., Hayajneh, A. & Quwaider, M. A Deep Learning Framework for Automatic Detection of Hate Speech Embedded in Arabic Tweets. Arab J Sci Eng 46, 4001–4014 (2021). https://doi.org/10.1007/s13369-021-05383-3

Received: 30 August 2020
Accepted: 18 January 2021
Published: 05 February 2021
Issue Date: April 2021
DOI: https://doi.org/10.1007/s13369-021-05383-3
