Automatic detection of phishing pages with event-based request processing, deep-hybrid feature extraction and light gradient boosted machine model, Telecommunication Systems



Telecommunication Systems (IF 1.7), Pub Date: 2021-05-19, DOI: 10.1007/s11235-021-00799-6
Ömer Kasim

Phishing attacks that target unwary users are a serious threat to cyber security. It is important to quickly distinguish phishing web pages from legitimate ones. Although studies in the literature detect phishing successfully, the problem of a high false-positive rate after the web page request has been processed remains to be resolved. The novelty of this study is that classification of deep-hybrid features with the Light Gradient Boosted Machine model is triggered as an event when a web address is entered in the browser's address bar. Thus, phishing can be detected at every request entry before the request is completed. In the proposed approach, normalized features from web page requests are fed to Sparse Autoencoder (SAE) and Principal Component Analysis (PCA) methods, which together encode the deep-hybrid features. A Light Gradient Boosted Machine classifier can then effectively distinguish legitimate pages from phishing attacks using these features. The ISCX-URL phishing dataset is used to measure and validate the performance of the proposed approach. Within the event, the proposed method classifies the SAE-PCA-encoded features with the Light Gradient Boosted Machine model at an accuracy of 99.6%. The results show that the proposed approach achieves better classification performance metrics than most other approaches. Compared with other models, this accuracy helps address the false-positive problem before requests are processed.
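
The event-based idea in the abstract (score a URL at the moment it is entered, before the request is processed) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the four lexical features, the normalization bounds, the `RequestGate` class, and `toy_classifier` are hypothetical and are not taken from the paper. In particular, `toy_classifier` only stands in for the SAE-PCA encoding and Light Gradient Boosted Machine model, which the abstract does not specify in enough detail to reproduce.

```python
from dataclasses import dataclass
from typing import Callable, List
from urllib.parse import urlparse


def extract_features(url: str) -> List[float]:
    """Hypothetical lexical URL features (not the paper's feature set)."""
    host = urlparse(url).hostname or ""
    return [
        float(len(url)),                         # total URL length
        float(host.count(".")),                  # dots in the host name
        float(sum(ch.isdigit() for ch in url)),  # digits anywhere in the URL
        float("@" in url),                       # 1.0 if '@' appears
    ]


def min_max_normalize(values: List[float], lows: List[float],
                      highs: List[float]) -> List[float]:
    """Scale each feature to [0, 1] with fixed bounds, clipping outliers."""
    scaled = []
    for value, low, high in zip(values, lows, highs):
        if high == low:
            scaled.append(0.0)  # degenerate range: no information
        else:
            scaled.append(min(1.0, max(0.0, (value - low) / (high - low))))
    return scaled


@dataclass
class RequestGate:
    """Runs a classifier as an event before the request is sent."""
    classify: Callable[[List[float]], float]  # returns a phishing score in [0, 1]
    lows: List[float]
    highs: List[float]
    threshold: float = 0.5

    def on_url_entered(self, url: str) -> bool:
        """Return True if the request may proceed, False if it is blocked."""
        features = min_max_normalize(extract_features(url), self.lows, self.highs)
        return self.classify(features) < self.threshold


def toy_classifier(features: List[float]) -> float:
    """Placeholder for the trained model: mean of the normalized features."""
    return sum(features) / len(features)


# Assumed normalization bounds, chosen only for this example.
gate = RequestGate(
    classify=toy_classifier,
    lows=[0.0, 0.0, 0.0, 0.0],
    highs=[100.0, 5.0, 20.0, 1.0],
)
```

With these assumed bounds, `gate.on_url_entered("http://example.com/")` lets the request through, while a long, digit-heavy URL containing `@` is blocked. In a real deployment the `classify` callable would wrap a trained model, and the decision threshold would be tuned on validation data.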


Updated: 2021-05-19
