Feature subset selection via an improved discretization-based particle swarm optimization
Applied Soft Computing ( IF 7.2 ) Pub Date : 2020-11-12 , DOI: 10.1016/j.asoc.2020.106794
Yu Zhou , Jiping Lin , Hainan Guo

High-dimensional data analysis has attracted increasing attention in machine learning and data mining tasks. Irrelevant and redundant features often seriously degrade classification accuracy. Feature selection (FS), which aims to improve predictive accuracy by selecting a subset of features, therefore plays a very important role. In this paper, we propose an improved discretization-based particle swarm optimization (PSO) method for FS. Our method first applies a moderate pre-screening process to reduce the number of features. It then builds a ranking-based cut-point table that stores, for each feature, multiple cut-points sorted by an entropy-based cut-point priority. To find the combination of cut-points that best distinguishes the data samples, a simple yet efficient encoding and decoding approach in PSO selects a flexible number of cut-points. Moreover, a probability-guided local search strategy searches for better combinations of cut-points to obtain a promising feature subset. Comprehensive simulation results on 19 benchmark datasets demonstrate the effectiveness of the improved strategies in the proposed method and its advantages over several state-of-the-art PSO-based competitors.
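
The abstract does not spell out how the entropy-based cut-point priority is computed, so the following is only a minimal sketch of one common reading: candidate cut-points for a single feature are ranked by information gain (the reduction in class entropy after a binary split). The function names `entropy`, `ranked_cut_points`, and `discretize` are hypothetical and are not taken from the paper; the authors' actual priority measure, encoding, and local search may differ.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def ranked_cut_points(values, labels):
    """Rank candidate cut-points of one feature, highest information gain first.

    Assumption: candidates are midpoints between consecutive distinct
    sorted feature values; the paper may generate candidates differently.
    """
    pairs = sorted(zip(values, labels))
    xs = [v for v, _ in pairs]
    ys = [y for _, y in pairs]
    n = len(ys)
    base = entropy(ys)
    scored = []
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue  # no boundary between identical values
        cut = (xs[i] + xs[i - 1]) / 2
        left, right = ys[:i], ys[i:]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        scored.append((gain, cut))
    # Sort by descending gain; break ties by the smaller cut value.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [cut for _, cut in scored]


def discretize(value, cuts):
    """Map a value to the index of the interval defined by the chosen cut-points."""
    return sum(value > c for c in cuts)


if __name__ == "__main__":
    table = ranked_cut_points([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1])
    print(table)                         # ranked cut-point table for this feature
    print(discretize(2.7, table[:1]))    # interval index using only the top cut-point
```

In a discretization-based scheme of this kind, a particle could then choose how many of the top-ranked cut-points to keep per feature, with a feature that keeps none being treated as unselected. That interpretation is an assumption based on the abstract, not a confirmed detail of the method.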




Updated: 2020-11-12