A New Optimal Ensemble Algorithm Based on SVDD Sampling for Imbalanced Data Classification, International Journal of Pattern Recognition and Artificial Intelligence

International Journal of Pattern Recognition and Artificial Intelligence (IF 0.9), Pub Date: 2020-12-24, DOI: 10.1142/s0218001421500208
Jamshid Pirgazi, Abbas Pirmohammadi, Reza Shams

Imbalanced data classification is an active topic in data mining, and several recent studies have addressed difficulties in the field. Approaches based on ensemble classifiers, in particular, have achieved reasonable results. Despite this success, several issues remain unsolved: the importance of individual samples is disregarded during balancing, the proper number of classifiers is hard to determine, and the weights of the base classifiers in the voting stage of ensemble methods are not optimized. This paper seeks an admissible solution to these challenges. The proposed solution applies support vector data description (SVDD) to sample both the minority and majority classes. After the optimal number of base classifiers is determined, the selected samples are used to adjust the base classifiers. Finally, a genetic algorithm is used to find the optimal weight of each base classifier in the voting stage. The proposed method is compared with several existing algorithms, and the experimental results confirm its effectiveness.
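
The final step of the abstract, searching for voting weights with a genetic algorithm, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the fitness function, the genetic operators, or the population settings, so the G-mean fitness, the uniform crossover, the Gaussian mutation, the toy validation data, and every function and parameter name below are assumptions chosen only for illustration. The SVDD sampling and classifier-count steps are not shown.

```python
import random

def weighted_vote(weights, votes):
    """Combine base-classifier votes; votes[k][i] is classifier k's label
    (+1 minority, -1 majority) for validation sample i."""
    n = len(votes[0])
    scores = [sum(w * v[i] for w, v in zip(weights, votes)) for i in range(n)]
    return [1 if s > 0 else -1 for s in scores]

def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall (assumed fitness, not from the paper)."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == -1]
    tpr = sum(p == 1 for p in pos) / len(pos)
    tnr = sum(p == -1 for p in neg) / len(neg)
    return (tpr * tnr) ** 0.5

def evolve_weights(votes, y_true, pop_size=30, generations=40, seed=0):
    """Search for voting weights in [0, 1] with a simple elitist GA."""
    rng = random.Random(seed)
    k = len(votes)

    def fitness(w):
        return g_mean(y_true, weighted_vote(w, votes))

    # Start from equal weights plus random individuals, so the result is
    # never worse than plain majority voting on the validation set.
    pop = [[1.0] * k] + [[rng.random() for _ in range(k)]
                         for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]          # the best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            j = rng.randrange(k)                              # mutate one gene
            child[j] = min(1.0, max(0.0, child[j] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = elite + children
    best = max(pop, key=fitness)
    return best, fitness(best)

# Toy validation labels and votes from three hypothetical base classifiers.
y_val = [1, 1, 1, -1, -1, -1, -1, -1, -1, -1]
votes = [
    [1, 1, -1, -1, -1, -1, -1, -1, -1, -1],
    [1, -1, -1, -1, -1, 1, 1, -1, -1, -1],
    [-1, -1, 1, 1, 1, -1, -1, -1, -1, 1],
]
baseline = g_mean(y_val, weighted_vote([1.0] * 3, votes))
weights, score = evolve_weights(votes, y_val)
print("equal-weight G-mean:", baseline)
print("GA weights:", weights, "G-mean:", score)
```

Because the equal-weight vector is placed in the initial population and the best individuals are carried over each generation, the returned score cannot fall below the equal-weight baseline on the validation data. In practice the weights would be tuned on held-out data rather than the training set, to avoid overfitting the vote.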

Updated: 2020-12-24