DTOF-ANN: An Artificial Neural Network phishing detection model based on Decision Tree and Optimal Features
Applied Soft Computing (IF 8.7), Pub Date: 2020-06-30, DOI: 10.1016/j.asoc.2020.106505
Erzhou Zhu, Yinyin Ju, Zhile Chen, Feng Liu, Xianyong Fang

Phishing has recently emerged as one of the biggest threats to people's everyday networking environments. Phishing attackers disguise illegal URLs as normal ones and use social engineering channels, such as email and SMS, to steal users' private information, which calls for an effective method of preventing phishing attacks and reducing the losses they cause. Neural networks can be used to detect and prevent phishing attacks because of their strong ability to actively learn from massive datasets and their high accuracy in data classification. However, duplicate points in the public datasets and negative or useless features in the feature vectors can cause the training of the neural network to over-fit, which weakens the trained classifier when it detects phishing websites. To address this shortcoming, this paper proposes DTOF-ANN (Decision Tree and Optimal Features based Artificial Neural Network), a neural-network phishing detection model based on decision tree and optimal feature selection. First, the traditional K-medoids clustering algorithm is improved with an incremental selection of initial centers to remove the duplicate points from the public datasets. Then, an optimal feature selection algorithm, based on a newly defined feature evaluation index, a decision tree, and a local search method, is designed to prune out the negative and useless features. Finally, the optimal structure of the neural network classifier is constructed by properly adjusting its parameters, and the classifier is trained on the selected optimal features. Experimental results demonstrate that DTOF-ANN outperforms many existing methods.
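The two preprocessing ideas in the abstract, removing duplicate points and pruning uninformative features, can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not define the improved K-medoids procedure or the new feature evaluation index, so the sketch swaps in exact-duplicate removal and the information gain of a one-level decision tree split. The function names, the `min_gain` threshold, the feature names, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def drop_duplicates(rows):
    """Keep the first occurrence of each (features, label) row.

    Stand-in for the paper's clustering-based duplicate removal.
    """
    seen = set()
    unique = []
    for features, label in rows:
        key = (tuple(features), label)
        if key not in seen:
            seen.add(key)
            unique.append((features, label))
    return unique


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, index):
    """Entropy reduction from splitting the rows on one feature.

    This is the score a decision tree would give a single split;
    it is an assumed substitute for the paper's evaluation index.
    """
    labels = [label for _, label in rows]
    groups = {}
    for features, label in rows:
        groups.setdefault(features[index], []).append(label)
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_features(rows, min_gain=0.01):
    """Return indices of features whose gain is at least min_gain."""
    n_features = len(rows[0][0])
    return [i for i in range(n_features)
            if information_gain(rows, i) >= min_gain]


# Toy data: (feature vector, label); label 1 = phishing, 0 = legitimate.
# Hypothetical features: [has_ip_in_url, uses_https, constant_flag]
data = [
    ([1, 0, 1], 1),
    ([1, 0, 1], 1),  # exact duplicate of the row above
    ([1, 1, 1], 1),
    ([0, 1, 1], 0),
    ([0, 0, 1], 0),
]

unique_rows = drop_duplicates(data)
selected = select_features(unique_rows)
print(len(unique_rows), selected)
```

On this toy data only the first feature separates the two classes, so it is the only one kept; the constant feature and the class-independent feature carry no information gain. In the pipeline the abstract describes, the retained features would then be used to train the neural network classifier.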

Updated: 2020-06-30
