A burst-based unsupervised method for detecting review spammer groups
Information Sciences (IF 8.1), Pub Date: 2020-05-29, DOI: 10.1016/j.ins.2020.05.084
Shu-juan Ji, Qi Zhang, Jinpeng Li, Dickson K.W. Chiu, Shaohua Xu, Lei Yi, Maoguo Gong

With the development of e-commerce, online shopping has become a part of people's lives. Because customers often consult online product reviews when shopping, sellers often collude with review spammers to write fake reviews that promote or demote target products. Spammers working in groups are more harmful than those acting individually. To detect such spammer groups, previous researchers have proposed algorithms based on frequent itemset mining and graph-based algorithms. In this paper, we propose a method called GSDB (Group Spam Detection algorithm based on review Burst). Our algorithm first locates the target products attacked by spammers by detecting abnormalities in the product rating distribution. As group spammers usually post many fake reviews within a short period, we design a burst-based algorithm that discovers candidate spammer groups in review bursts using Kernel Density Estimation. As some innocent reviewers may coincidentally post reviews during a burst period, we formulate a variety of individual spam indicators that measure the spamicity of each reviewer, so that innocent reviewers can be separated from the candidate spammer groups. Finally, we design a series of group spam indicators to measure the spamicity of the candidate groups and classify them. Experimental results show that the proposed GSDB algorithm outperforms state-of-the-art algorithms.
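
The burst-discovery step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the kernel, bandwidth, or burst criterion, so the Gaussian kernel, the `bandwidth` and `factor` parameters, the density threshold, and the function names `gaussian_kde` and `find_bursts` are all assumptions made for illustration. The sketch estimates the density of review days for a single product and treats each contiguous stretch of unusually high density as a burst, whose reviewers form a candidate group.

```python
import math


def gaussian_kde(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `samples` on `grid`."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


def find_bursts(reviews, bandwidth=2.0, factor=2.0):
    """Find review bursts for ONE product.

    reviews: list of (reviewer_id, day) pairs, with day as an integer.
    Returns a list of (start_day, end_day, reviewer_ids) tuples.
    The threshold rule below is an assumption, not taken from the paper.
    """
    days = [day for _, day in reviews]
    first, last = min(days), max(days)
    grid = list(range(first, last + 1))
    density = gaussian_kde(days, grid, bandwidth)

    # Assumed criterion: a day is "bursty" when its estimated density is at
    # least `factor` times the density of reviews spread evenly over the span.
    threshold = factor / (last - first + 1)

    windows, start = [], None
    for day, dens in zip(grid, density):
        if dens >= threshold and start is None:
            start = day                      # a burst window opens
        elif dens < threshold and start is not None:
            windows.append((start, day - 1))  # the window closed yesterday
            start = None
    if start is not None:
        windows.append((start, grid[-1]))

    # Everyone who reviewed inside a window is a candidate group member.
    return [
        (s, e, sorted({r for r, d in reviews if s <= d <= e}))
        for s, e in windows
    ]


if __name__ == "__main__":
    # Hypothetical data: six scattered reviews plus five within three days.
    sample = [("a", 0), ("b", 20), ("c", 40), ("d", 60), ("e", 80), ("f", 100),
              ("s1", 50), ("s2", 50), ("s3", 51), ("s4", 51), ("s5", 52)]
    for start, end, members in find_bursts(sample):
        print(start, end, members)
```

A candidate group found this way can still contain innocent reviewers who happened to post during the burst, which is why the method goes on to score individual reviewers before scoring the groups.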




Updated: 2020-05-29