A content and URL analysis-based efficient approach to detect smishing SMS in intelligent systems, International Journal of Intelligent Systems

A content and URL analysis-based efficient approach to detect smishing SMS in intelligent systems
International Journal of Intelligent Systems (IF 7), Pub Date: 2022-09-06, DOI: 10.1002/int.23035
Ankit K. Jain, Brij B. Gupta, Kamaljeet Kaur, Piyush Bhutani, Wadee Alhalabi, Ammar Almomani

Smishing combines short message service (SMS) and phishing: a malicious text message is sent to mobile users. This form of attack has become a severe cyber-security problem and has caused substantial monetary losses to victims. Many antismishing solutions for mobile devices have been proposed to date, but a full-fledged solution is still lacking. Therefore, this paper proposes an efficient approach that analyzes the text content and the uniform resource locator (URL) present in the SMS. We integrate a URL phishing classifier with the text classifier to improve accuracy, because some SMS contain a URL with little or no text. To identify rare words in a document, we use the TF-IDF weighting scheme, which combines term frequency (TF) with inverse document frequency (IDF). We used two data sets for the text and URL phishing classifiers and applied the synthetic minority oversampling technique (SMOTE) to balance the training data. The voting classifier merges the predictions of the classifiers passed into it and predicts the output by voting. The proposed approach, which integrates KNN, RF, and ETC, detects smishing messages with 99.03% accuracy and 98.94% precision. This compares favorably with existing approaches: the SmiDCA model (96.40% accuracy using a Random Forest classifier in BFSA), Feature-Based (98.74% accuracy and 94.20% true positive rate), and Smishing Detector (96.29% overall accuracy).
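
The two mechanisms named in the abstract, TF-IDF weighting and voting, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `tf_idf` and `majority_vote`, the toy messages, and the labels are hypothetical, the TF-IDF variant (raw relative frequency times log(N/df)) is one common choice among several, and hard majority voting is an assumption, since the abstract does not say whether hard or soft voting is used.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / len(doc), idf = log(N / df).
    Smoothed or normalized variants exist and may differ from the paper.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


def majority_vote(predictions):
    """Hard voting: return the label predicted by the most classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical tokenized messages, for illustration only.
docs = [
    ["claim", "your", "prize", "now"],
    ["see", "you", "at", "lunch"],
    ["your", "lunch", "is", "ready"],
]
weights = tf_idf(docs)
# "prize" occurs in one message, "your" in two, so "prize" is weighted higher.
print(weights[0]["prize"] > weights[0]["your"])

# Three hypothetical per-classifier outputs standing in for KNN, RF, and ETC.
print(majority_vote(["smishing", "ham", "smishing"]))
```

Rarer terms receive larger weights because their document frequency is lower, and a term that appears in every message receives a weight of zero under this variant. With three voters and two labels, a hard vote can never tie.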

Updated: 2022-09-06
