A hybrid feature selection method based on information theory and binary butterfly optimization algorithm
Engineering Applications of Artificial Intelligence (IF 8), Pub Date: 2020-11-12, DOI: 10.1016/j.engappai.2020.104079
Zohre Sadeghian, Ebrahim Akbari, Hossein Nematzadeh

Feature selection is the problem of finding the optimal subset of features for predicting class labels by removing irrelevant or redundant features. The S-shaped Binary Butterfly Optimization Algorithm (S-bBOA) is a nature-inspired algorithm for solving feature selection problems. Evidence shows that S-bBOA performs better than other optimization algorithms in exploration, exploitation, convergence, and avoiding entrapment in local optima. However, S-bBOA does not consider the redundancy and relevancy of features. This paper first proposes the Information Gain binary Butterfly Optimization Algorithm (IG-bBOA) to overcome this limitation of S-bBOA. IG-bBOA maximizes both the classification accuracy and the mean mutual information between features and class labels. In addition, IG-bBOA tries to minimize the number of selected features, and it is used within a proposed three-phase method called the Ensemble Information Theory based binary Butterfly Optimization Algorithm (EIT-bBOA). In the first phase, 80% of irrelevant and redundant features are removed using Minimal Redundancy-Maximal New Classification Information (MR-MNCI) feature selection. In the second phase, the best feature subset is selected using IG-bBOA. Finally, a similarity-based ranking method is used to select the final feature subset. Experiments are conducted on six standard datasets from the UCI repository. The findings confirm that, in most cases, the proposed method improves classification accuracy while selecting a feature subset with the minimum number of features.
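
The abstract names three quantities that IG-bBOA trades off: classification accuracy, mean mutual information between selected features and class labels, and the number of selected features. The sketch below illustrates one way such an objective could be combined. It is not the authors' formulation: the weighted-sum form, the weights `w_acc`, `w_mi`, and `w_size`, the function names, and the sigmoid used as the S-shaped transfer function are all assumptions made for illustration, and the accuracy is taken as a number computed elsewhere by some classifier.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def s_shaped_binarize(position, rand):
    """Assumed sigmoid transfer: map a continuous position to a 0/1 choice.

    The feature is selected (1) when rand < 1 / (1 + exp(-position)).
    """
    return 1 if rand < 1.0 / (1.0 + math.exp(-position)) else 0


def fitness(mask, features, labels, accuracy,
            w_acc=0.8, w_mi=0.1, w_size=0.1):
    """Hypothetical weighted objective; higher is better.

    mask     -- list of 0/1 flags, one per feature
    features -- list of feature columns (each a list of discrete values)
    labels   -- list of class labels
    accuracy -- classifier accuracy on the selected subset, supplied by caller
    The weights are illustrative assumptions, not values from the paper.
    """
    selected = [col for col, keep in zip(features, mask) if keep]
    if not selected:
        return 0.0  # an empty subset cannot classify anything
    mean_mi = sum(mutual_information(col, labels) for col in selected) / len(selected)
    size_term = 1.0 - len(selected) / len(features)  # rewards smaller subsets
    return w_acc * accuracy + w_mi * mean_mi + w_size * size_term
```

With equal accuracy and subset size, a subset whose features share more information with the labels scores higher under this objective. Note that mutual information in bits can exceed 1 for multi-class labels, so a real objective would likely normalize that term.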




Updated: 2020-11-12