Spam detection through feature selection using artificial neural network and sine–cosine algorithm

  • Original Research
Mathematical Sciences

Abstract

Distinguishing spam from non-spam emails is a major challenge for email service providers and users alike. Spam emails waste Internet traffic and contain malicious links that mostly direct users to phishing webpages. Spam also plays a role in spreading malware across networks, which further emphasizes the need for its detection. Data mining methods such as artificial neural networks (ANNs) have been applied to spam detection, but they are prone to significant output error, mostly because all the spam features are included in the training stage. To reduce the spam detection error, this paper proposes a feature selection-based method using the sine–cosine algorithm (SCA). In the proposed method, feature vectors are updated by the SCA to select the optimal features for training the ANN. Implementation of the proposed method on the Spambase dataset in MATLAB yielded a precision, accuracy and sensitivity of 98.64%, 97.92% and 98.36%, respectively. The proposed method thus outperformed the multilayer perceptron (MLP) neural network, Bayesian network, decision tree and random forest classifiers in spam detection. According to the test results, the feature selection error in the MLP neural network decreased by approximately 2.18% using the SCA.
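The abstract does not give the update equations, so the following is only a minimal Python sketch of how SCA-driven feature selection could look, not the authors' MATLAB implementation. It assumes the standard sine–cosine position update, a 0.5 threshold to turn continuous positions into a feature mask, and a caller-supplied `fitness` function. All function names, parameters and the toy fitness below are hypothetical; in the paper's setting the fitness would presumably be the error of an ANN trained on the selected features.

```python
import math
import random


def sca_feature_selection(fitness, n_features, n_agents=10, n_iters=50, a=2.0, seed=0):
    """Minimise fitness(mask) over boolean feature masks with SCA-style updates."""
    rng = random.Random(seed)
    # Continuous positions in [0, 1]; a feature counts as selected when its value > 0.5.
    positions = [[rng.random() for _ in range(n_features)] for _ in range(n_agents)]

    def to_mask(pos):
        return [x > 0.5 for x in pos]

    # The best agent seen so far plays the role of the destination point.
    best_pos = min(positions, key=lambda p: fitness(to_mask(p)))[:]
    best_fit = fitness(to_mask(best_pos))

    for t in range(n_iters):
        r1 = a - t * a / n_iters  # shrinks linearly from a towards 0
        for pos in positions:
            for j in range(n_features):
                r2 = rng.uniform(0.0, 2.0 * math.pi)
                r3 = rng.uniform(0.0, 2.0)
                r4 = rng.random()
                # r4 switches between the sine and the cosine branch.
                wave = math.sin(r2) if r4 < 0.5 else math.cos(r2)
                step = r1 * wave * abs(r3 * best_pos[j] - pos[j])
                pos[j] = min(1.0, max(0.0, pos[j] + step))  # clamp to [0, 1]
            fit = fitness(to_mask(pos))
            if fit < best_fit:
                best_fit, best_pos = fit, pos[:]
    return to_mask(best_pos), best_fit


def toy_fitness(mask):
    # Hypothetical stand-in for a classifier's validation error:
    # counts how many features disagree with an arbitrary "ideal" subset.
    target = [True, False, True, True, False, False, True, False]
    return sum(m != t for m, t in zip(mask, target))


mask, err = sca_feature_selection(toy_fitness, n_features=8)
```

Passing the fitness as a callable keeps the search loop independent of the classifier, so the toy function can be replaced by any routine that trains a model on the masked features and returns its error.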



Author information

Corresponding author

Correspondence to Yaser Rostami.


About this article

Cite this article

Talaei Pashiri, R., Rostami, Y. & Mahrami, M. Spam detection through feature selection using artificial neural network and sine–cosine algorithm. Math Sci 14, 193–199 (2020). https://doi.org/10.1007/s40096-020-00327-8


  • DOI: https://doi.org/10.1007/s40096-020-00327-8
