Abstract
Phishing is a fraudulent practice and a form of cyber-attack designed and executed with the sole purpose of gathering sensitive information by masquerading as genuine websites. Phishers fool users by replicating original, genuine content so that they reveal personal information such as security numbers, credit card numbers and passwords. Many anti-phishing techniques have been proposed to date, including blacklist- or whitelist-based, heuristic-feature-based and visual-similarity-based methods. Modern browsers have adapted to reduce the chances of users being trapped, yet users still fall prey to phishers and end up revealing their secret information. In a previous work, the authors proposed a machine learning approach based on heuristic features for phishing website detection and achieved an accuracy of 99.5% using 18 features. In this paper, we propose novel phishing URL detection models using (a) a Deep Neural Network (DNN), (b) Long Short-Term Memory (LSTM) and (c) a Convolutional Neural Network (CNN), using only 10 features from our earlier work. The proposed technique achieves an accuracy of 99.52% for DNN, 99.57% for LSTM and 99.43% for CNN. The proposed techniques use only one third-party service feature, which makes them more robust to failure and increases the speed of phishing detection.
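To make the idea of classifying a URL from a fixed set of 10 features concrete, the following is a minimal sketch of a feed-forward (DNN-style) forward pass in plain Python. It is not the authors' model: the layer sizes, the random untrained weights, the example feature vector and the names `phishing_score`, `init_layer` and `dense` are all assumptions made for illustration, and the resulting score carries no meaning until such a network is trained on labelled data.

```python
import math
import random

NUM_FEATURES = 10  # the abstract states that 10 features are used


def relu(x):
    """Rectified linear unit, used here for the hidden layers."""
    return x if x > 0.0 else 0.0


def sigmoid(x):
    """Squash a real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights (untrained, illustrative)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, weights, biases, activation):
    """Fully connected layer: activation(W . inputs + b) for each output unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def phishing_score(features, layers):
    """Forward pass: ReLU hidden layers, then a single sigmoid output unit."""
    if len(features) != NUM_FEATURES:
        raise ValueError(f"expected {NUM_FEATURES} features, got {len(features)}")
    activations = list(features)
    for weights, biases in layers[:-1]:
        activations = dense(activations, weights, biases, relu)
    out_weights, out_biases = layers[-1]
    return dense(activations, out_weights, out_biases, sigmoid)[0]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Hypothetical layer sizes (10 -> 16 -> 8 -> 1); not taken from the paper.
layers = [
    init_layer(NUM_FEATURES, 16, rng),
    init_layer(16, 8, rng),
    init_layer(8, 1, rng),
]
example = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]  # made-up feature vector
score = phishing_score(example, layers)
print(f"score = {score:.3f}")
```

In a trained model of this shape, a score above a chosen threshold (for example 0.5) would be read as "phishing" and a score below it as "legitimate". The LSTM and CNN variants named in the abstract would replace the hidden layers with recurrent or convolutional ones while keeping the same 10-feature input.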
Acknowledgements
This research was funded by the Ministry of Electronics and Information Technology (MeitY), Government of India. The authors sincerely thank MeitY for financial support. The authors thank the anonymous referees for their comments and criticism, which have helped to improve the quality of the paper.
Cite this article
Somesha, M., Pais, A.R., Rao, R.S. et al. Efficient deep learning techniques for the detection of phishing websites. Sādhanā 45, 165 (2020). https://doi.org/10.1007/s12046-020-01392-4
DOI: https://doi.org/10.1007/s12046-020-01392-4