CyberBERT: BERT for cyberbullying identification

  • Special Issue Paper
Multimedia Systems

Abstract

Cyberbullying can be defined as a deliberate and repeated act of aggression carried out through social media platforms such as Facebook, Twitter, and Instagram. BERT (Bidirectional Encoder Representations from Transformers), a state-of-the-art pre-trained language model, has achieved remarkable results on many language understanding tasks. In this paper, we present a novel application of BERT to cyberbullying identification. A straightforward classification model built on BERT achieves state-of-the-art results across three real-world corpora: Formspring (~12k posts), Twitter (~16k posts), and Wikipedia (~100k posts). Experimental results demonstrate that the proposed model achieves significant improvements over existing work when compared with slot-gated or attention-based deep neural network models.

Notes

  1. https://cyberbullying.org/.

Acknowledgements

Dr. Sriparna Saha gratefully acknowledges the Young Faculty Research Fellowship (YFRF) Award, supported by the Visvesvaraya Ph.D. Scheme for Electronics and IT, Ministry of Electronics and Information Technology (MeitY), Government of India, and implemented by Digital India Corporation (formerly Media Lab Asia), for carrying out this research.

Author information

Corresponding author

Correspondence to Sriparna Saha.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Paul, S., Saha, S. CyberBERT: BERT for cyberbullying identification. Multimedia Systems 28, 1897–1904 (2022). https://doi.org/10.1007/s00530-020-00710-4

  • DOI: https://doi.org/10.1007/s00530-020-00710-4
