Spam review detection using self-organizing maps and convolutional neural networks, Computers & Security



Computers & Security (IF 4.8), Pub Date: 2021-04-17, DOI: 10.1016/j.cose.2021.102274
Ashraf Neisari, Luis Rueda, Sherif Saad

Online public reviews significantly influence customers who purchase products or seek services. Fake reviews are posted online to promote or demote targeted products or the reputation of organizations and businesses. Spam review detection has been the focus of many researchers in recent years. As online services grow rapidly, the importance of the issue keeps increasing, and it needs to be addressed properly. A variety of approaches have been introduced to distinguish truthful reviews from fake ones. The features engineered in past studies typically fall into two types: linguistic-based and behavioral-based characteristics of the reviews. Unsupervised, supervised and semi-supervised machine learning methods have been widely used to perform this classification. This paper introduces a novel approach to distinguish fake reviews from genuine ones using linguistic features. Unsupervised learning via self-organizing maps (SOM) in conjunction with a convolutional neural network (CNN) is employed to classify the reviews. We transform the reviews into images by arranging semantically similar words around a pixel of the image or, equivalently, a SOM grid cell. The resulting review images are then fed to the CNN for supervised training and classification. Comprehensive tests on two gold-standard datasets show the effectiveness of the proposed method in single-domain and multi-domain contexts, with accuracies of 88% and 87%, respectively.
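The review-to-image step described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the toy 2-D word embeddings, the 4x4 grid, the training schedule, and the count-based pixel intensity are all assumptions made for illustration, and the names `train_som`, `best_matching_unit` and `review_to_image` are hypothetical. The sketch trains a small SOM on word vectors, then turns a review into a grid image by incrementing the cell that each known word maps to. The CNN that would consume such images is not shown.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Return (row, col) of the SOM cell whose weight vector is closest to `vector`."""
    best = None
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((a - b) ** 2 for a, b in zip(w, vector))
            if best is None or dist < best[0]:
                best = (dist, r, c)
    return best[1], best[2]


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Fit a rows x cols self-organizing map to `vectors` (assumed schedule:
    linearly decaying learning rate and Gaussian neighborhood radius)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    sigma0 = max(rows, cols) / 2.0
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    total_steps = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        order = list(vectors)
        rng.shuffle(order)
        for v in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)
            sigma = max(sigma0 * (1.0 - frac), 0.5)
            br, bc = best_matching_unit(weights, v)
            # Pull every cell toward v, weighted by grid distance to the winner.
            for r in range(rows):
                for c in range(cols):
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2.0 * sigma * sigma))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (v[i] - w[i])
            step += 1
    return weights


def review_to_image(tokens, embeddings, weights):
    """Map a tokenized review to a grid image; pixel intensity is the
    normalized count of words landing in each cell (an assumption)."""
    rows, cols = len(weights), len(weights[0])
    image = [[0.0] * cols for _ in range(rows)]
    for token in tokens:
        if token in embeddings:  # words without an embedding are skipped
            r, c = best_matching_unit(weights, embeddings[token])
            image[r][c] += 1.0
    peak = max(max(row) for row in image)
    if peak > 0:
        image = [[x / peak for x in row] for row in image]
    return image


# Toy, hand-made embeddings (hypothetical; real word vectors would be learned).
embeddings = {
    "great": [0.9, 0.1], "good": [0.8, 0.2],
    "awful": [0.1, 0.9], "bad": [0.2, 0.8],
    "hotel": [0.5, 0.5],
}
som = train_som(list(embeddings.values()), rows=4, cols=4)
image = review_to_image("great hotel good staff".split(), embeddings, som)
```

Because the SOM places similar vectors in nearby cells, words with related meanings tend to light up neighboring pixels, which is the kind of spatial structure a convolutional network can exploit.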


Updated: 2021-05-03
