当前位置: X-MOL 学术Comput. Intell. › 论文详情
Our official English website, www.x-mol.net, welcomes your feedback! (Note: you will need to create a separate account there.)
Variable population-sized particle swarm optimization for highly imbalanced dataset classification
Computational Intelligence ( IF 2.8 ) Pub Date : 2021-02-04 , DOI: 10.1111/coin.12436
Devi Priya Rangasamy 1 , Sivaraj Rajappan 2 , Anitha Natarajan 1 , Rajadevi Ramasamy 1 , Devisurya Vijayakumar 1, 2
Affiliation  

Real-world datasets used for classification often face many challenges when they are imbalanced in nature which is unavoidable and need to be handled by analysts. Many researchers have proposed methods for handling imbalanced datasets and they mostly concentrated on handling binary classification with only two class labels. Only very few research works have been carried out for treating highly imbalanced datasets and fail to handle multiclass datasets. To address imbalance problem in multiclass datasets, this paper proposes a Variable Population sized Particle Swarm Optimization (VPPSO) which is a modified version of Particle Swarm Optimization (PSO) which works based on clustering. PSO usually has fixed population size and has high computational complexity. In order to reduce this, the population size is varied over generations and the particles are loaded into the population iteratively by retaining the balance nature of solutions. PSO optimizes the selection of training and testing samples from each class label in imbalanced datasets for improved classification results. From the implementation results, it is evident that using VPPSO, highly imbalanced datasets with multiclass attributes are classified more efficiently than state-of-the-art algorithms. The statistical results also prove the superior performance of VPPSO.

中文翻译:

高度不平衡的数据集分类的可变种群大小粒子群优化

当分类中的现实世界数据集本质上不平衡时,通常会面临许多挑战,这是不可避免的,需要由分析人员处理。许多研究人员提出了处理不平衡数据集的方法,并且他们主要集中于仅使用两个类标签来处理二进制分类。对于处理高度不平衡的数据集,只有很少的研究工作,并且无法处理多类数据集。为了解决多类数据集中的不平衡问题,本文提出了一种可变种群大小的粒子群优化算法(VPPSO),它是基于聚类的粒子群优化算法(PSO)的改进版本。PSO通常具有固定的人口规模,并且具有很高的计算复杂性。为了减少这种情况,种群的大小随着世代的变化而变化,并且通过保持溶液的平衡性质,将粒子迭代地加载到种群中。PSO优化了不平衡数据集中每个类别标签的训练和测试样本的选择,从而改善了分类结果。从实施结果来看,很明显,使用VPPSO可以比最先进的算法更有效地对具有多类属性的高度不平衡的数据集进行分类。统计结果也证明了VPPSO的优越性能。与最新算法相比,具有多类属性的高度不平衡的数据集的分类效率更高。统计结果也证明了VPPSO的优越性能。与最新算法相比,具有多类属性的高度不平衡的数据集的分类效率更高。统计结果也证明了VPPSO的优越性能。
更新日期:2021-02-04
down
wechat
bug