An Ensemble Method for Radicalization and Hate Speech Detection Online Empowered by Sentic Computing, Cognitive Computation



Cognitive Computation (IF 5.4), Pub Date: 2021-02-16, DOI: 10.1007/s12559-021-09845-6
Oscar Araque, Carlos A. Iglesias

The dramatic growth of the Web has motivated researchers to extract knowledge from enormous repositories and to exploit the knowledge in myriad applications. In this study, we focus on natural language processing (NLP) and, more concretely, the emerging field of affective computing to explore the automation of understanding human emotions from texts. This paper continues previous efforts to utilize and adapt affective techniques into different areas to gain new insights. This paper proposes two novel feature extraction methods that use the previous sentic computing resources AffectiveSpace and SenticNet. These methods are efficient approaches for extracting affect-aware representations from text. In addition, this paper presents a machine learning framework using an ensemble of different features to improve the overall classification performance. Following the description of this approach, we also study the effects of known feature extraction methods such as TF-IDF and SIMilarity-based sentiment projectiON (SIMON). We perform a thorough evaluation of the proposed features across five different datasets that cover radicalization and hate speech detection tasks. To compare the different approaches fairly, we conducted a statistical test that ranks the studied methods. The obtained results indicate that combining affect-aware features with the studied textual representations effectively improves performance. We also propose a criterion considering both classification performance and computational complexity to select among the different methods.
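
As a rough illustration of what combining affect-aware features with a textual representation can look like, the sketch below concatenates a small TF-IDF vector with two lexicon-derived affect features. It is not the authors' implementation: the lexicon `TOY_AFFECT_LEXICON` and its values are invented for this example (they are not SenticNet or AffectiveSpace data), and the helper names `fit_idf`, `tfidf_vector`, `affect_vector`, and `combined_features` are hypothetical. The abstract does not specify how the features are joined, so plain concatenation is an assumption here.

```python
import math
from collections import Counter

# Hypothetical toy polarity lexicon for illustration only.
# These are NOT real SenticNet or AffectiveSpace values.
TOY_AFFECT_LEXICON = {"hate": -0.9, "love": 0.8, "attack": -0.7, "peace": 0.6}


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def fit_idf(docs):
    """Compute a smoothed inverse document frequency for each vocabulary term."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(tokenize(doc)))
    vocab = sorted(doc_freq)
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0 for t in vocab}
    return vocab, idf


def tfidf_vector(text, vocab, idf):
    """Raw term count times IDF for every vocabulary term, in vocab order."""
    counts = Counter(tokenize(text))
    return [counts[t] * idf[t] for t in vocab]


def affect_vector(text):
    """Two affect-aware features: mean polarity of lexicon hits and hit ratio."""
    tokens = tokenize(text)
    scores = [TOY_AFFECT_LEXICON[t] for t in tokens if t in TOY_AFFECT_LEXICON]
    mean_polarity = sum(scores) / len(scores) if scores else 0.0
    coverage = len(scores) / len(tokens) if tokens else 0.0
    return [mean_polarity, coverage]


def combined_features(text, vocab, idf):
    """Concatenate the textual (TF-IDF) and affect-aware feature vectors."""
    return tfidf_vector(text, vocab, idf) + affect_vector(text)


docs = ["we love peace", "they hate us and attack"]
vocab, idf = fit_idf(docs)
features = [combined_features(d, vocab, idf) for d in docs]
```

Each resulting vector could then be passed to any standard classifier. The sketch only shows how the two feature families can share one input representation.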


Updated: 2021-02-16
