A novel binary many-objective feature selection algorithm for multi-label data classification
International Journal of Machine Learning and Cybernetics (IF 3.1), Pub Date: 2021-04-12, DOI: 10.1007/s13042-021-01291-y
Azam Asilian Bidgoli, Hossein Ebrahimpour-komleh, Shahryar Rahnamayan

For multi-label classification, redundant and irrelevant features degrade classification performance. To select the best features under several conflicting objectives, feature selection can be modeled as a large-scale optimization problem. Most existing multi-objective feature selection methods select features by minimizing two well-known objectives, the number of features and the classification error; however, additional objectives can be considered to improve classification performance. In this study, for the first time, a many-objective optimization method is proposed to select efficient features for multi-label classification based not only on the two objectives mentioned above, but also on maximizing the correlation between features and labels and minimizing the computational complexity of the features. Maximizing the correlation could lead to higher classification accuracy. On the other hand, selecting less complex features decreases the computational complexity of the feature extraction phase. The main aim of this paper is to tackle multi-label feature selection based on the number of features, the classification error, the correlation between features and labels, and the computational complexity of the features, simultaneously. The resulting many-objective feature selection problem is solved using a proposed binary version of the NSGA-III algorithm. The binary operator improves the exploration power of the optimizer when searching the large-scale space. To evaluate the proposed algorithm (called binary NSGA-III), a benchmarking experiment is conducted on eight multi-label datasets in terms of several multi-objective assessment metrics, including the Hypervolume indicator, Pure Diversity, and Set-coverage. Experimental results show significant improvements for the proposed method in comparison with other algorithms.
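
The four-objective formulation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, and it does not reproduce the binary NSGA-III operator. It only shows how a binary feature mask might be mapped to four objectives (all expressed as minimization) and how Pareto dominance compares two candidate subsets. The names `evaluate`, `dominates`, `pareto_front`, the caller-supplied `error_fn`, and the per-feature `relevance` and `cost` scores are hypothetical and introduced here for illustration; treating an empty subset as the worst case for error is likewise an assumption.

```python
from typing import Callable, List, Sequence, Tuple

Objectives = Tuple[float, float, float, float]


def evaluate(
    mask: Sequence[int],
    error_fn: Callable[[List[int]], float],
    relevance: Sequence[float],
    cost: Sequence[float],
) -> Objectives:
    """Map a binary feature mask to four objectives, all to be minimized.

    error_fn, relevance and cost are illustrative assumptions:
    - error_fn(selected) returns a classification error in [0, 1]
    - relevance[i] is a precomputed feature-label correlation score
    - cost[i] is the computational cost of extracting feature i
    """
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        # Assumption: an empty subset gets the worst possible error.
        return (0.0, 1.0, 0.0, 0.0)
    n_features = float(len(selected))
    error = error_fn(selected)
    # Correlation is maximized, so its negation is minimized.
    neg_relevance = -sum(relevance[i] for i in selected) / len(selected)
    total_cost = sum(cost[i] for i in selected)
    return (n_features, error, neg_relevance, total_cost)


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: Sequence[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Toy data (made up for illustration, not taken from the paper).
toy_relevance = [0.75, 0.25, 0.5]
toy_cost = [1.0, 4.0, 2.0]


def toy_error(selected: List[int]) -> float:
    # Stand-in for a real multi-label classifier's error.
    return 0.5 ** len(selected)


candidates = [[1, 0, 1], [1, 1, 1], [0, 1, 0], [0, 1, 1]]
scores = [evaluate(m, toy_error, toy_relevance, toy_cost) for m in candidates]
front = pareto_front(scores)
```

In this toy setting the mask `[0, 1, 1]` is dominated by `[1, 0, 1]`, which has the same size and error but higher relevance and lower cost, while the other three subsets are mutually non-dominated. A many-objective optimizer such as NSGA-III would search over such masks rather than enumerate them.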




Updated: 2021-04-12