EGGS: A Flexible Approach to Relational Modeling of Social Network Spam


arXiv - CS - Social and Information Networks, Pub Date: 2020-01-14, DOI: arxiv-2001.04909
Jonathan Brophy and Daniel Lowd

Social networking websites face a constant barrage of spam, unwanted messages that distract, annoy, and even defraud honest users. These messages tend to be very short, making them difficult to identify in isolation. Furthermore, spammers disguise their messages to look legitimate, tricking users into clicking on links and tricking spam filters into tolerating their malicious behavior. Thus, some spam filters examine relational structure in the domain, such as connections among users and messages, to better identify deceptive content. However, even when it is used, relational structure is often exploited in an incomplete or ad hoc manner. In this paper, we present Extended Group-based Graphical models for Spam (EGGS), a general-purpose method for classifying spam in online social networks. Rather than labeling each message independently, we group related messages together when they have the same author, the same content, or other domain-specific connections. To reason about related messages, we combine two popular methods: stacked graphical learning (SGL) and probabilistic graphical models (PGMs). Both methods capture the idea that messages are more likely to be spammy when related messages are also spammy, but they do so in different ways: SGL uses sequential classifier predictions, whereas PGMs use probabilistic inference. We apply our method to four different social network domains. EGGS is more accurate than an independent model in most experimental settings, especially when the correct label is uncertain. For the PGM implementation, we compare Markov logic networks to probabilistic soft logic and find that both work well, with neither one dominating. The combination of SGL and PGMs usually performs better than either on its own.
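As a rough illustration of the grouping idea described in the abstract, the sketch below groups messages by shared author or shared text and then performs one stacking round, in which each message's independent spam score is adjusted using the scores of related messages. It is not the authors' implementation: the function names, the `author` and `text` fields, the example scores, and the fixed blending `weight` are all assumptions made for illustration. A real stacked learner would train a second classifier on the relational features rather than use a fixed weight, and the PGM inference step is not shown at all.

```python
from collections import defaultdict


def group_by(messages, key):
    """Map each value of `key` to the indices of the messages sharing it."""
    groups = defaultdict(list)
    for i, msg in enumerate(messages):
        groups[msg[key]].append(i)
    return groups


def relational_feature(messages, scores, key):
    """Average the first-stage scores of the OTHER messages in each
    message's group; fall back to the message's own score if it is alone."""
    groups = group_by(messages, key)
    feats = []
    for i, msg in enumerate(messages):
        others = [scores[j] for j in groups[msg[key]] if j != i]
        feats.append(sum(others) / len(others) if others else scores[i])
    return feats


def stack_once(messages, scores, keys=("author", "text"), weight=0.5):
    """One stacking round (hypothetical simplification): blend each
    independent score with the mean of its relational features.
    The fixed `weight` stands in for a learned second-stage model."""
    rel = [relational_feature(messages, scores, k) for k in keys]
    stacked = []
    for i, s in enumerate(scores):
        neighbor = sum(r[i] for r in rel) / len(rel)
        stacked.append((1 - weight) * s + weight * neighbor)
    return stacked


# Toy data: made-up messages and made-up independent spam scores.
messages = [
    {"author": "a1", "text": "win a free phone"},
    {"author": "a1", "text": "click here now"},
    {"author": "a2", "text": "win a free phone"},
    {"author": "a3", "text": "see you at lunch"},
]
independent = [0.9, 0.5, 0.4, 0.1]
stacked = stack_once(messages, independent)
```

In this toy run, the two uncertain messages (scores 0.5 and 0.4) move upward because each shares an author or text with a message that already looks spammy, while the unrelated message keeps its score. This mirrors the abstract's claim that relational information helps most when the correct label is uncertain, though the numbers here are invented.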


Updated: 2020-01-30
