Applying machine learning and natural language processing to detect phishing email, Computers & Security



Computers & Security (IF 4.8), Pub Date: 2021-07-22, DOI: 10.1016/j.cose.2021.102414
Areej Alhogail, Afrah Alsabih


The growth of online services has been accompanied by a corresponding growth in cyber-attacks. One of the most common and effective attacks is phishing, in which an attacker attempts to steal confidential information by impersonating a legitimate source. Phishing emails succeed by manipulating human emotions: they raise concern and create a sense of urgency by claiming that the recipient must act quickly, and that hasty action may cause great financial and data losses. Therefore, we cannot rely solely on humans to detect phishing, and more effective, automatic phishing detection mechanisms are required. Many detectors have been proposed; however, the high volume of phishing emails calls for additional effort. Hence, in this study, we propose a phishing email classifier model that applies deep learning, using a graph convolutional network (GCN) and natural language processing over the email body text, to improve phishing detection accuracy. The literature has shown that GCNs succeed in text classification, and this study shows that they also improve the accuracy of email phishing detection. The classifier was tested in a supervised learning setting. Experimental tests verified that, among existing detection methods, the classifier was effective in detecting phishing emails using body text; it ran in a short time and achieved a high accuracy rate of 98.2% and a low false-positive rate of 0.015.
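To make the two technical ingredients of the abstract concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how the graph over email text is built or which GCN variant is used, so the sketch assumes the widely used propagation rule ReLU(D^{-1/2} (A + I) D^{-1/2} X W) and a toy three-node graph. The function names, the toy matrices, and the label convention (1 = phishing, 0 = legitimate) are all hypothetical. The last function shows how an accuracy and a false-positive rate of the kind reported above are computed from predictions.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square, non-negative adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps its own features during propagation.
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    deg = [sum(row) for row in a_tilde]  # always >= 1 because of the self-loops
    return [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(a_hat, features, weights):
    """One graph-convolution step: ReLU(A_hat @ X @ W)."""
    pre_activation = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, v) for v in row] for row in pre_activation]


def accuracy_and_fpr(y_true, y_pred):
    """Accuracy and false-positive rate, assuming 1 = phishing, 0 = legitimate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    accuracy = (tp + tn) / len(pairs)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


# Toy path graph 0 - 1 - 2 with 2-dimensional node features (illustrative values).
adjacency = [[0.0, 1.0, 0.0],
             [1.0, 0.0, 1.0],
             [0.0, 1.0, 0.0]]
features = [[1.0, 0.0],
            [0.0, 1.0],
            [1.0, 1.0]]
weights = [[0.5, -0.5],
           [0.25, 0.75]]

a_hat = normalize_adjacency(adjacency)
hidden = gcn_layer(a_hat, features, weights)

# Hypothetical labels and predictions for six emails.
acc, fpr = accuracy_and_fpr([1, 1, 0, 0, 0, 0], [1, 0, 1, 0, 0, 0])
```

In this sketch the false-positive rate is FP / (FP + TN), that is, the share of legitimate emails wrongly flagged as phishing. A real classifier would stack at least two such layers, learn the weights by gradient descent on labelled emails, and use a sparse-matrix library rather than nested lists.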


Updated: 2021-07-30
