Spam filtering using a logistic regression model trained by an artificial bee colony algorithm
Applied Soft Computing (IF 8.7), Pub Date: 2020-03-16, DOI: 10.1016/j.asoc.2020.106229
Bilge Kagan Dedeturk , Bahriye Akay

Email spam is a serious problem that annoys recipients and wastes their time. Machine-learning methods have been prevalent in spam detection systems owing to their efficiency in classifying mail as solicited or unsolicited. However, existing spam detection techniques usually suffer from low detection rates and cannot efficiently handle high-dimensional data. Therefore, we propose a novel spam detection method that combines the artificial bee colony algorithm with a logistic regression classification model. The empirical results on three publicly available datasets (Enron, CSDMC2010, and TurkishEmail) show that the proposed model can handle high-dimensional data thanks to its highly effective local and global search abilities. We compare the proposed model’s spam detection performance to those of support vector machine, logistic regression, and naive Bayes classifiers, in addition to the performance of the state-of-the-art methods reported by previous studies. We observe that the proposed method outperforms other spam detection techniques considered in this study in terms of classification accuracy.
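To make the idea in the abstract concrete, here is a minimal sketch of fitting logistic regression weights with a basic artificial bee colony search instead of gradient descent. It is not the authors' implementation: the abstract does not state the ABC variant, objective function, or parameter settings, so the cross-entropy objective, the names `abc_train`, `n_sources`, `limit`, `cycles`, and `bound`, and the two toy features (counts of two hypothetical words) are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_loss(w, X, y):
    # Mean cross-entropy of a logistic regression model; w[0] is the bias.
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)


def predict(w, x):
    return 1 if sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) >= 0.5 else 0


def abc_train(X, y, n_sources=20, limit=20, cycles=200, bound=5.0, seed=0):
    # Assumed settings: each food source is one candidate weight vector.
    rng = random.Random(seed)
    dim = len(X[0]) + 1

    def new_source():
        return [rng.uniform(-bound, bound) for _ in range(dim)]

    sources = [new_source() for _ in range(n_sources)]
    costs = [log_loss(s, X, y) for s in sources]
    trials = [0] * n_sources
    best_cost = min(costs)
    best = list(sources[costs.index(best_cost)])

    def try_neighbor(i):
        # Perturb one random coordinate toward or away from another source k,
        # then keep the candidate only if it lowers the loss (greedy selection).
        k = rng.choice([m for m in range(n_sources) if m != i])
        j = rng.randrange(dim)
        cand = list(sources[i])
        cand[j] += rng.uniform(-1.0, 1.0) * (sources[i][j] - sources[k][j])
        cand[j] = min(max(cand[j], -bound), bound)
        c = log_loss(cand, X, y)
        if c < costs[i]:
            sources[i], costs[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    for _ in range(cycles):
        for i in range(n_sources):  # employed bee phase
            try_neighbor(i)
        fitness = [1.0 / (1.0 + c) for c in costs]
        for _ in range(n_sources):  # onlooker phase: favor better sources
            try_neighbor(rng.choices(range(n_sources), weights=fitness)[0])
        i = min(range(n_sources), key=costs.__getitem__)
        if costs[i] < best_cost:  # remember the best solution seen so far
            best, best_cost = list(sources[i]), costs[i]
        stale = max(range(n_sources), key=trials.__getitem__)
        if trials[stale] > limit:  # scout phase: abandon an exhausted source
            sources[stale] = new_source()
            costs[stale] = log_loss(sources[stale], X, y)
            trials[stale] = 0
    return best, best_cost


# Toy data (hypothetical): [count of word A, count of word B]; 1 = spam, 0 = ham.
X = [[3, 0], [2, 0], [4, 1], [0, 2], [0, 3], [1, 4]]
y = [1, 1, 1, 0, 0, 0]
w, cost = abc_train(X, y)
print("weights:", w, "loss:", cost)
print("predictions:", [predict(w, x) for x in X])
```

In this sketch the employed and onlooker phases refine existing candidates locally, while the scout phase restarts stalled candidates at random positions, which is the usual way the algorithm balances local and global search. A real spam filter would replace the toy features with a high-dimensional text representation and evaluate on held-out mail.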




Updated: 2020-03-16