An optimization-based deep belief network for the detection of phishing e-mails, Data Technologies and Applications


An optimization-based deep belief network for the detection of phishing e-mails
Data Technologies and Applications (IF 1.6), Pub Date: 2020-07-16, DOI: 10.1108/dta-02-2020-0043
Arshey M., Angel Viji K. S.

Purpose

Phishing is a serious cybersecurity problem in which attackers use channels such as e-mail and Short Message Service (SMS) to collect individuals' personal information. The rapid growth of unsolicited and unwanted messages needs to be addressed, which makes the development of effective anti-phishing methods necessary.

Design/methodology/approach

This research aims to design and develop an approach for preventing phishing, based on a proposed optimization algorithm. The approach handles phishing e-mails in four steps: preprocessing, feature extraction, feature selection and classification. First, the input data set is preprocessed by removing stop words and applying stemming, and the preprocessed output is passed to feature extraction. Keyword frequencies are extracted from the preprocessed data, and the important words are selected as features. Feature selection is then carried out using the Bhattacharyya distance, so that only the significant features that can aid classification are retained. The selected features are classified using a deep belief network (DBN) trained with the proposed fractional-earthworm optimization algorithm (fractional-EWA). The fractional-EWA integrates the earthworm optimization algorithm (EWA) with fractional calculus to determine the DBN weights optimally.
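The first three steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word list, the crude suffix stripper `naive_stem`, and the choice to score each word by the Bhattacharyya distance between its smoothed per-class presence distributions are all assumptions made for illustration, since the abstract does not specify these details.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption; the abstract does not give one).
STOP_WORDS = {"the", "a", "an", "is", "to", "and", "of", "your", "you",
              "for", "in", "on", "at", "we"}


def naive_stem(word):
    """Crude suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, tokenize on letters, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


def keyword_frequencies(tokens, vocabulary):
    """Relative frequency of each vocabulary word within one e-mail."""
    counts = Counter(tokens)
    total = max(len(tokens), 1)
    return [counts[w] / total for w in vocabulary]


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions."""
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    bc = min(bc, 1.0)  # guard against floating-point overshoot
    return -math.log(bc) if bc > 0 else math.inf


def feature_scores(docs, labels, vocabulary):
    """Score each word by how differently it occurs in the two classes."""
    scores = {}
    for w in vocabulary:
        dists = []
        for cls in (0, 1):  # 0 = legitimate, 1 = phishing (assumption)
            cls_docs = [d for d, y in zip(docs, labels) if y == cls]
            present = sum(1 for d in cls_docs if w in d)
            # Laplace smoothing keeps both probabilities strictly positive.
            p = (present + 1) / (len(cls_docs) + 2)
            dists.append([p, 1 - p])
        scores[w] = bhattacharyya_distance(dists[0], dists[1])
    return scores


def select_features(docs, labels, k):
    """Keep the k words whose class distributions are furthest apart."""
    vocabulary = sorted({t for d in docs for t in d})
    scores = feature_scores(docs, labels, vocabulary)
    return sorted(vocabulary, key=lambda w: scores[w], reverse=True)[:k]
```

In this sketch a word that appears in most phishing e-mails but in few legitimate ones (or the reverse) gets a large distance and is kept, while a word that is equally common in both classes scores near zero. The frequency vectors built over the selected words would then be the input to the classifier.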

Findings

The accuracy of naive Bayes (NB), DBN, neural network (NN), EWA-DBN and fractional EWA-DBN is 0.5333, 0.5455, 0.5556, 0.5714 and 0.8571, respectively. The sensitivity of NB, DBN, NN, EWA-DBN and fractional EWA-DBN is 0.4558, 0.5631, 0.7035, 0.7045 and 0.8182, respectively. Likewise, the specificity of NB, DBN, NN, EWA-DBN and fractional EWA-DBN is 0.5052, 0.5631, 0.7028, 0.7040 and 0.8800, respectively. The comparative table shows that the proposed method achieved the highest accuracy, sensitivity and specificity among the methods compared.
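For reference, the three reported metrics are conventionally computed from the confusion matrix as sketched below. This uses the standard definitions and assumes phishing is the positive class (label 1); the abstract does not state the exact evaluation protocol, and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = phishing)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # phishing caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # phishing missed
    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Sensitivity: share of actual phishing e-mails that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of legitimate e-mails that were left alone.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }
```

Sensitivity and specificity are reported separately because a detector can score well on one while failing on the other, for example by flagging every message as phishing.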

Originality/value

This paper performs e-mail phishing detection using an optimization-based deep learning network. E-mail traffic includes many unwanted messages, which need to be detected in order to avoid storage issues. The significance of the method is that including historical data in the detection process improves detection accuracy.
