Cyberbullying detection from tweets using deep learning
Kybernetes ( IF 2.5 ) Pub Date : 2021-07-13 , DOI: 10.1108/k-01-2021-0061
Shubham Bharti 1 , Arun Kumar Yadav 1 , Mohit Kumar 1 , Divakar Yadav 1

Purpose

With the rise of social media platforms, the number of cyberbullying cases has grown. Every day, a large number of people, especially teenagers, become victims of cyber abuse. Cyberbullying can have a long-lasting psychological impact on the victim, who may develop social anxiety, engage in self-harm, become depressed or, in extreme cases, be driven to suicide. This paper aims to evaluate various techniques for automatically detecting cyberbullying in tweets using machine learning and deep learning approaches.

Design/methodology/approach

The authors applied machine learning algorithms and, after analyzing the experimental results, postulated that deep learning algorithms perform better for the task. Word-embedding techniques were used to represent words for model training. Pre-trained GloVe embeddings were used to generate the word embeddings; different versions of GloVe were used and their performance was compared. A bi-directional long short-term memory (BLSTM) network was used for classification.
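To make the embedding step concrete, here is a minimal sketch of how pre-trained GloVe vectors might be turned into an embedding matrix and how a tweet might be encoded as padded token ids for a sequence classifier such as a BLSTM. The function names (`load_glove`, `build_embedding_matrix`, `encode`), the three-dimensional toy vectors, the whitespace tokenizer and the choice of id 0 for padding are all assumptions made for illustration; the abstract does not describe the authors' preprocessing, and the classifier itself is not shown.

```python
import io


def load_glove(handle):
    """Parse GloVe's text format: a token followed by its space-separated floats."""
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(vocab, vectors, dim):
    """Row 0 is reserved for padding; row i holds the vector of vocab[i - 1]."""
    matrix = [[0.0] * dim]
    for word in vocab:
        # Words missing from GloVe get a zero row here (an assumption of this sketch).
        matrix.append(vectors.get(word, [0.0] * dim))
    return matrix


def encode(tweet, word_index, max_len):
    """Lowercase, split on whitespace, map tokens to ids, then truncate or pad to max_len."""
    ids = [word_index.get(tok, 0) for tok in tweet.lower().split()]
    ids = ids[:max_len]
    return ids + [0] * (max_len - len(ids))


# Toy stand-in for a real GloVe file; the values are made up.
TOY_GLOVE = io.StringIO("you 0.1 0.2 0.3\nare 0.0 0.5 0.1\nawful -0.7 0.4 0.9\n")

vectors = load_glove(TOY_GLOVE)
vocab = ["you", "are", "awful"]
word_index = {word: i + 1 for i, word in enumerate(vocab)}
matrix = build_embedding_matrix(vocab, vectors, dim=3)
ids = encode("You are awful", word_index, max_len=5)
print(ids)
```

In a full pipeline the matrix would typically initialise the embedding layer of the network, and the id sequences would be its input.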

Findings

The dataset contains 35,787 labeled tweets. The GloVe840 word embedding technique along with BLSTM provided the best results on the dataset with an accuracy, precision and F1 measure of 92.60%, 96.60% and 94.20%, respectively.
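For readers less familiar with these measures, the sketch below shows how accuracy, precision, recall and F1 are computed from confusion-matrix counts. The counts are invented for illustration, since the abstract does not report the confusion matrix. The helper `implied_recall` is also an assumption of this sketch: it simply inverts the F1 formula, and applying it to the reported precision and F1 suggests a recall of roughly 91.9%, provided both figures refer to the same class and ignoring rounding.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification measures from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def implied_recall(precision, f1):
    """Solve F1 = 2PR / (P + R) for R, given P and F1."""
    return f1 * precision / (2 * precision - f1)


# Made-up counts, not taken from the paper.
toy = classification_metrics(tp=8, fp=2, fn=2, tn=8)
print(toy)

# Recall implied by the reported precision (96.60%) and F1 (94.20%).
recall_estimate = implied_recall(0.966, 0.942)
print(round(recall_estimate, 4))
```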

Research limitations/implications

If a word is not present in the pre-trained embedding (GloVe), it may be given a random vector representation that does not correspond to the actual meaning of the word. In other words, an out-of-vocabulary (OOV) word may not be represented suitably, which can affect the detection of cyberbullying tweets. The problem may be rectified through the use of character-level embedding of words.
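The contrast between a random OOV vector and a character-level representation can be illustrated with a small sketch. It compares a misspelled word with its correct form using character trigram counts, which serve here as an untrained stand-in for learned character-level embeddings; this is not the method proposed in the paper. All names (`char_ngrams`, `ngram_profile`, `random_oov_vector`), the trigram size and the example words are assumptions of this sketch.

```python
import math
import random
from collections import Counter


def char_ngrams(word, n=3):
    """Character n-grams of a word wrapped in boundary markers."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def ngram_profile(word, n=3):
    """Sparse count vector over the word's character n-grams."""
    return Counter(char_ngrams(word, n))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def random_oov_vector(word, dim=8):
    """Random fallback vector: reproducible per word, but unrelated to its spelling or meaning."""
    rng = random.Random(word)
    return [rng.uniform(-0.25, 0.25) for _ in range(dim)]


# "stupidd" stands in for a misspelling that a pre-trained vocabulary may lack.
similar = cosine(ngram_profile("stupid"), ngram_profile("stupidd"))
unrelated = cosine(ngram_profile("stupid"), ngram_profile("lovely"))
print(round(similar, 3), unrelated)
```

Because the misspelling shares five of its trigrams with the correct form, the two profiles stay close, whereas a random vector for the misspelling would carry no such relationship.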

Practical implications

The findings of this work may inspire entrepreneurs to leverage the proposed approach to build deployable systems that detect cyberbullying in different contexts, such as the workplace and schools, and may also draw the attention of lawmakers and policymakers to the need for systemic tools to tackle the ills of cyberbullying.

Social implications

Cyberbullying, if effectively detected, may spare victims various psychological problems, which, in turn, may lead society toward a healthier and more productive life.

Originality/value

The proposed method produced results that outperform the state-of-the-art approaches in detecting cyberbullying from tweets. It uses a large dataset, created by intelligently merging two publicly available datasets. Further, a comprehensive evaluation of the proposed methodology has been presented.



Updated: 2021-07-12