An efficient deep learning-based scheme for web spam detection in IoT environment, Future Generation Computer Systems



An efficient deep learning-based scheme for web spam detection in IoT environment
Future Generation Computer Systems (IF 6.2), Pub Date: 2020-03-04, DOI: 10.1016/j.future.2020.03.004
Aaisha Makkar, Neeraj Kumar

Over the last few years, the Internet of Things (IoT) has revolutionized the world. In IoT, smart objects sense and compute to provide uninterrupted services to end users in applications such as smart transportation and e-healthcare. Because these objects can inherently take adaptive, intelligent decisions, the Cognitive Internet of Things has emerged as a further IoT paradigm. However, web spam is one of the challenges to be handled when accessing data from the Internet. The literature shows that individuals mostly prefer search engines for accessing data, and effective search engine ranking can reduce the computational cost that objects incur during data access. Current solutions aim to discover spam in the search engine only after it has occurred. In this proposal, we therefore present a cognitive spammer framework that removes spam pages while the search engine calculates the web page rank score. The framework detects web spam by training a Long Short-Term Memory (LSTM) network on link features; this training yielded an accuracy of 95.25%, with more than 111,000 hosts correctly classified. The content features, in contrast, are trained with a neural network. The proposed scheme has been validated on the WEBSPAM-UK 2007 dataset, which is first pre-processed using a new technique called ‘Split by Over-sampling and Train by Under-fitting’. An ensemble and cross-validation approach was used to optimize the results, reaching an accuracy of 96.96%. The proposed scheme thus outperforms the existing techniques.
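
The idea of removing spam pages while the page rank score is computed can be illustrated with a minimal sketch. It is not the authors' framework: the abstract does not give its details, so the code below uses a plain power-iteration PageRank and assumes the set of spam hosts has already been produced by some classifier. The host names, the `drop_spam` helper, and the damping factor of 0.85 are all hypothetical choices made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict {host: [outgoing hosts]}."""
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    if n == 0:
        return {}
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by hosts with no outgoing links is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


def drop_spam(links, spam_hosts):
    """Remove flagged hosts and every link pointing to them (hypothetical helper)."""
    return {
        source: [t for t in targets if t not in spam_hosts]
        for source, targets in links.items()
        if source not in spam_hosts
    }


# Hypothetical toy web graph: three "farm" hosts exist only to boost "promo".
links = {
    "news": ["blog"],
    "blog": ["news"],
    "farm1": ["promo"],
    "farm2": ["promo"],
    "farm3": ["promo"],
    "promo": ["farm1", "farm2", "farm3"],
}
# Assumed classifier output; in the paper this role is played by the trained models.
spam_hosts = {"farm1", "farm2", "farm3", "promo"}

raw_rank = pagerank(links)
clean_rank = pagerank(drop_spam(links, spam_hosts))
print(sorted(raw_rank.items(), key=lambda item: -item[1]))
print(sorted(clean_rank.items(), key=lambda item: -item[1]))
```

In this toy graph the link farm lifts `promo` above the legitimate hosts when all links are counted, while filtering the flagged hosts before ranking leaves only `news` and `blog` to share the score. Filtering at ranking time, rather than after results are served, is the design point the abstract emphasizes.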


Updated: 2020-03-04
