Efficient deep learning techniques for the detection of phishing websites

Somesha, M; Pais, Alwyn Roshan; Rao, Routhu Srinivasa; Rathour, Vikram Singh

doi:10.1007/s12046-020-01392-4


Published: 27 June 2020

Volume 45, article number 165, (2020)

Sādhanā

Abstract

Phishing is a fraudulent practice and a form of cyber-attack designed and executed with the sole purpose of gathering sensitive information by masquerading as genuine websites. Phishers replicate the content of original, genuine websites to fool users into revealing personal information such as security numbers, credit card numbers and passwords. Many anti-phishing techniques have been proposed to date, including blacklist- or whitelist-based, heuristic-feature-based and visual-similarity-based methods. Modern browsers have adapted to reduce the chances of users being trapped by such attacks, but users still fall prey to phishers and end up revealing their secret information. In a previous work, the authors proposed a machine learning approach based on heuristic features for phishing website detection and achieved an accuracy of 99.5% using 18 features. In this paper, we propose novel phishing URL detection models using (a) a Deep Neural Network (DNN), (b) Long Short-Term Memory (LSTM) and (c) a Convolutional Neural Network (CNN), using only 10 features from our earlier work. The proposed techniques achieve an accuracy of 99.52% for DNN, 99.57% for LSTM and 99.43% for CNN. They utilize only one third-party service feature, which makes them more robust to failure and increases the speed of phishing detection.
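As a rough illustration of the DNN variant, the following minimal sketch runs the forward pass of a small feed-forward network over a 10-element feature vector and returns a phishing probability. It is not the authors' implementation: the layer sizes (10, 16, 8, 1), the ReLU and sigmoid activations, the random untrained weights, the example feature values and all function names are assumptions made for illustration, since the abstract specifies only that 10 features are used.

```python
import math
import random

N_FEATURES = 10  # from the abstract; the individual features are not listed there


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation):
    """Apply one fully connected layer: activation(W x + b)."""
    weights, biases = layer
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_phishing_probability(features, layers):
    """Forward pass: ReLU hidden layers, then a single sigmoid output unit."""
    if len(features) != N_FEATURES:
        raise ValueError(f"expected {N_FEATURES} features, got {len(features)}")
    h = features
    for layer in layers[:-1]:
        h = dense(h, layer, relu)
    return dense(h, layers[-1], sigmoid)[0]


rng = random.Random(0)
# Hypothetical architecture 10 -> 16 -> 8 -> 1 with untrained weights.
layers = [init_layer(10, 16, rng), init_layer(16, 8, rng), init_layer(8, 1, rng)]
# Made-up feature vector; real values would come from URL and page heuristics.
example = [1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0]
score = predict_phishing_probability(example, layers)
```

Because the weights are untrained, `score` carries no meaning beyond lying strictly between 0 and 1. In practice the weights would be fitted on labelled phishing and legitimate samples, and a threshold such as 0.5 (also an assumption) would turn the probability into a class label.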



Acknowledgements

This research was funded by the Ministry of Electronics and Information Technology (MeitY), Government of India. The authors sincerely thank MeitY for financial support. The authors thank the anonymous referees for their comments and criticism, which have helped to improve the quality of the paper.

Author information

Authors and Affiliations

Information Security Research Lab, National Institute of Technology Karnataka, Surathkal, 575025, India
M Somesha, Alwyn Roshan Pais, Routhu Srinivasa Rao & Vikram Singh Rathour

Corresponding author

Correspondence to M Somesha.


About this article

Cite this article

Somesha, M., Pais, A.R., Rao, R.S. et al. Efficient deep learning techniques for the detection of phishing websites. Sādhanā 45, 165 (2020). https://doi.org/10.1007/s12046-020-01392-4


Received: 18 May 2019
Revised: 10 January 2020
Accepted: 12 February 2020
Published: 27 June 2020
DOI: https://doi.org/10.1007/s12046-020-01392-4
