Multimodal cyberbullying detection using capsule network with dynamic routing and deep convolutional neural network
Multimedia Systems (IF 3.5), Pub Date: 2021-02-02, DOI: 10.1007/s00530-020-00747-5
Akshi Kumar, Nitin Sachdeva

Cyberbullying is the use of information technology networks by individuals to humiliate, tease, embarrass, taunt, defame and disparage a target without any face-to-face contact. With the upsurge of social networking sites such as Facebook, Instagram, YouTube and Twitter, social media has become the 'virtual playground' used by bullies. It is critical to implement models and systems for the automatic detection and resolution of bullying content available online, as the ramifications can lead to a societal epidemic. This paper presents a deep neural model for cyberbullying detection in three different modalities of social data, namely textual, visual and info-graphic (text embedded along with an image). The all-in-one architecture, CapsNet–ConvNet, consists of a capsule network (CapsNet) with dynamic routing for predicting the textual bullying content and a convolutional neural network (ConvNet) for predicting the visual bullying content. The info-graphic content is discretized by separating the text from the image using Google Lens in the Google Photos app. A perceptron-based decision-level late fusion strategy for multimodal learning is used to dynamically combine the predictions of the discrete modalities and output the final category as bullying or non-bullying. Experimental evaluation is done on a mix-modal dataset which contains 10,000 comments and posts scraped from YouTube, Instagram and Twitter. The proposed model achieves an AUC–ROC of 0.98.
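
The decision-level late fusion step can be illustrated with a small sketch. The abstract does not specify the perceptron's inputs, weights or training procedure, so everything below is an assumption made for illustration: the two inputs are taken to be the bullying probabilities emitted by the text model and the image model, the toy scores and labels are invented, and the classic perceptron learning rule stands in for whatever training the authors actually used. The names `predict` and `train_perceptron` are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumption: each post is summarised by two per-modality bullying
# probabilities, (p_text, p_image), produced by separate upstream models.
Scores = Tuple[float, float]


def predict(weights: Sequence[float], bias: float, scores: Sequence[float]) -> int:
    """Return 1 (bullying) if the weighted sum of modality scores is positive, else 0."""
    activation = bias + sum(w * s for w, s in zip(weights, scores))
    return 1 if activation > 0 else 0


def train_perceptron(
    samples: Sequence[Scores],
    labels: Sequence[int],
    epochs: int = 1000,
    lr: float = 0.1,
) -> Tuple[List[float], float]:
    """Fit fusion weights with the classic perceptron update rule."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for scores, label in zip(samples, labels):
            error = label - predict(weights, bias, scores)
            if error != 0:
                # Move the decision boundary towards the misclassified sample.
                weights = [w + lr * error * s for w, s in zip(weights, scores)]
                bias += lr * error
                mistakes += 1
        if mistakes == 0:
            break  # every training sample is classified correctly
    return weights, bias


# Hypothetical toy data: a post is labelled bullying (1) when at least
# one modality scores high. These numbers are not from the paper.
samples: List[Scores] = [
    (0.9, 0.8), (0.8, 0.2), (0.1, 0.9),
    (0.2, 0.1), (0.3, 0.2), (0.1, 0.3),
]
labels: List[int] = [1, 1, 1, 0, 0, 0]

weights, bias = train_perceptron(samples, labels)
fused = [predict(weights, bias, s) for s in samples]
print("fused predictions:", fused)
```

Because fusion happens on the predictions rather than on the raw features, each modality's model can be trained and replaced independently; only the small fusion unit needs to be refit.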



