A multi-view attention-based deep learning system for online deviant content detection, World Wide Web



World Wide Web (IF 2.7), Pub Date: 2020-09-30, DOI: 10.1007/s11280-020-00840-9
Yunji Liang, Bin Guo, Zhiwen Yu, Xiaolong Zheng, Zhu Wang, Lei Tang

As user-generated content grows exponentially, social media platforms do not always enforce their policies and guidelines, and deviant content that violates them has become prevalent. The adverse effects of deviant content are devastating and far-reaching. However, detecting deviant content in sparse and imbalanced textual data is challenging: many stakeholders with different stances are involved, and the subtle linguistic cues depend heavily on complex context. To address this problem, we propose a multi-view attention-based deep learning system that combines random subspace and binary particle swarm optimization (RS-BPSO) to distill content of interest (candidates) from imbalanced data, and applies context and view attention mechanisms in a convolutional neural network (dubbed SSCNN) to extract structural and semantic features. We evaluate the proposed approach on a large-scale dataset collected from Facebook and find that RS-BPSO detects whether content is associated with marijuana with an accuracy of 87.55%, and that SSCNN outperforms the baselines with an accuracy of 94.50%.
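The abstract does not say how random subspace sampling and binary particle swarm optimization are combined, so the following is only a minimal sketch of the two generic building blocks, not the authors' RS-BPSO. It assumes the textbook binary PSO update (velocity passed through a sigmoid to give the probability of a bit being 1) and a plain random draw of feature indices. The function names, hyperparameters, and toy fitness function are all hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def random_subspace(n_features, k, rng):
    """Draw k distinct feature indices (one random subspace)."""
    return sorted(rng.sample(range(n_features), k))


def binary_pso(fitness, n_bits, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Maximize fitness(bits) over 0/1 lists of length n_bits.

    Hyperparameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_bits)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_bits for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                # personal best positions
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]  # global best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_bits):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp the velocity
                vel[i][d] = v
                # The sigmoid of the velocity is the probability of a 1 bit.
                pos[i][d] = 1 if rng.random() < sigmoid(v) else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


if __name__ == "__main__":
    rng = random.Random(1)
    subspace = random_subspace(n_features=30, k=8, rng=rng)
    # Toy fitness (hypothetical): reward agreement with a hidden mask.
    hidden = [1, 0, 1, 0, 1, 0, 1, 0]
    score = lambda bits: sum(1 for b, t in zip(bits, hidden) if b == t)
    mask, best = binary_pso(score, n_bits=len(subspace))
    print("subspace:", subspace)
    print("selected mask:", mask, "fitness:", best)
```

In a real pipeline the toy fitness would presumably be replaced by a validation score of a classifier trained on the features the mask selects within each subspace; that substitution is an assumption about how such a search is typically used, not a detail given in the abstract.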


Updated: 2020-09-30
