Abstract
Email spam is a form of automated messaging in which unsolicited messages, typically sent for commercial purposes, are distributed in bulk to multiple mailing lists, individuals, or newsgroups. To build an effective spam detection system, we combine a Random Forest (RF) with a Deep Neural Network (DNN) classifier and evaluate the resulting classification accuracy. The RF algorithm uses a predetermined probability of attributes when constructing its decision trees, and the Gini measure is used to rank the important features. The main objective is to rank the features with the RF algorithm and to train a DNN classifier on the data. The DNN is trained with the backpropagation algorithm in batch learning mode, which requires the entire training set to be available at once. The detector is dynamically fitted to new data patterns until it reaches the spam coverage. Experimental results show that the DNN achieves a higher classification rate than k-nearest neighbors (KNN) and the Support Vector Machine (SVM), with an accuracy of 88.59% when the five top-ranked features are considered.
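The feature-ranking step can be illustrated with a minimal sketch. It is not the authors' implementation: instead of a full Random Forest it scores each feature by the largest Gini impurity decrease that a single threshold split on that feature achieves, then keeps the top-ranked ones. The feature names (`f_a`, `f_b`, `f_c`), the toy data, and the helper functions are all hypothetical and chosen only for illustration. The selected features would then be passed to a neural network classifier, which is omitted here.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_gini_decrease(values, labels):
    """Largest impurity decrease over all threshold splits of one feature."""
    n = len(labels)
    parent = gini(labels)
    best = 0.0
    # Every distinct value except the largest is a candidate threshold.
    for threshold in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        weighted_child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - weighted_child)
    return best


def rank_features(rows, labels, names, top_k):
    """Return the names of the top_k features by Gini impurity decrease."""
    scores = {
        name: best_gini_decrease([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical toy data: 1 = spam, 0 = not spam.
names = ["f_a", "f_b", "f_c"]
rows = [
    (0.9, 3, 1),
    (0.8, 1, 0),
    (0.7, 2, 1),
    (0.1, 3, 0),
    (0.2, 1, 1),
    (0.0, 2, 0),
]
labels = [1, 1, 1, 0, 0, 0]

top_features = rank_features(rows, labels, names, top_k=2)
print(top_features)  # ['f_a', 'f_c']
```

In this toy set `f_a` separates the two classes perfectly, so it receives the maximum decrease of 0.5, while `f_b` carries no information and scores 0. A Random Forest averages such impurity decreases over many randomized trees, which generally gives a more stable ranking than this single-split stand-in.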
Change history
30 May 2022
This article has been retracted. Please see the Retraction Notice for more detail: https://doi.org/10.1007/s12652-022-03995-7
About this article
Cite this article
Sumathi, S., Pugalendhi, G.K. RETRACTED ARTICLE: Cognition based spam mail text analysis using combined approach of deep neural network classifier and random forest. J Ambient Intell Human Comput 12, 5721–5731 (2021). https://doi.org/10.1007/s12652-020-02087-8
DOI: https://doi.org/10.1007/s12652-020-02087-8