Abstract
Distinguishing spam from non-spam emails is a major challenge for email service providers and users alike. Spam emails waste Internet traffic and often contain malicious links that direct users to phishing webpages. Spam also plays a role in spreading malware across the network, which further emphasizes the need for its detection. Although data mining methods such as artificial neural networks (ANNs) have been applied to spam detection, they are prone to significant output error, mostly because all spam features are included in the training stage. To reduce the spam detection error, this paper proposes a feature selection-based method using the sine–cosine algorithm (SCA). In the proposed method, feature vectors are updated by the SCA to select the optimal features for training the ANN. Implementation of the proposed method on the Spambase dataset in MATLAB yielded a precision, accuracy and sensitivity of 98.64%, 97.92% and 98.36%, respectively. The proposed method thus outperformed the multilayer perceptron (MLP) neural network, Bayesian network, decision tree and random forest classifiers in spam detection. According to the test results, the feature selection error in the MLP neural network decreased by approximately 2.18% using the SCA.
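The idea of updating feature vectors with the SCA can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation. It assumes the commonly published form of the sine–cosine position update (a step of size r1·sin(r2) or r1·cos(r2) toward the best solution found so far, with r1 shrinking linearly over the iterations), and it assumes that a continuous position in [0, 1] is turned into a feature mask by thresholding at 0.5. The function names, the parameter defaults and the toy fitness function are hypothetical; in the paper's setting the fitness would be the classification error of an ANN trained on the selected Spambase features.

```python
import math
import random


def sca_feature_selection(fitness, n_features, n_agents=20, n_iters=100, a=2.0, seed=0):
    """Minimise fitness(mask) over boolean feature masks with a sine-cosine search.

    Assumption: each agent lives in [0, 1]^n_features and a feature counts as
    selected when its coordinate exceeds 0.5.
    """
    rng = random.Random(seed)

    def to_mask(pos):
        return [x > 0.5 for x in pos]

    agents = [[rng.random() for _ in range(n_features)] for _ in range(n_agents)]
    best_pos = min(agents, key=lambda p: fitness(to_mask(p)))[:]
    best_fit = fitness(to_mask(best_pos))

    for t in range(n_iters):
        r1 = a - t * a / n_iters  # shrinks linearly from a towards 0
        for pos in agents:
            for j in range(n_features):
                r2 = rng.uniform(0.0, 2.0 * math.pi)
                r3 = rng.uniform(0.0, 2.0)
                r4 = rng.random()
                # r4 picks the sine or the cosine branch of the update
                wave = math.sin(r2) if r4 < 0.5 else math.cos(r2)
                step = r1 * wave * abs(r3 * best_pos[j] - pos[j])
                pos[j] = min(1.0, max(0.0, pos[j] + step))  # stay inside [0, 1]
            fit = fitness(to_mask(pos))
            if fit < best_fit:
                best_fit, best_pos = fit, pos[:]
    return to_mask(best_pos), best_fit


# Hypothetical stand-in for "ANN error on the selected features":
# the fraction of features whose selection differs from a known useful set.
USEFUL = [True] * 5 + [False] * 5


def toy_fitness(mask):
    return sum(m != u for m, u in zip(mask, USEFUL)) / len(USEFUL)


mask, err = sca_feature_selection(toy_fitness, n_features=len(USEFUL))
print(mask, err)
```

Passing the fitness in as a callable keeps the search loop independent of the classifier, so the toy function could be swapped for a routine that trains a network on the masked columns and returns its validation error.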
Cite this article
Talaei Pashiri, R., Rostami, Y. & Mahrami, M. Spam detection through feature selection using artificial neural network and sine–cosine algorithm. Math Sci 14, 193–199 (2020). https://doi.org/10.1007/s40096-020-00327-8
DOI: https://doi.org/10.1007/s40096-020-00327-8