Feature selection based on label distribution and fuzzy mutual information
Information Sciences ( IF 8.1 ) Pub Date : 2021-06-07 , DOI: 10.1016/j.ins.2021.06.005
Chuanzhen Xiong , Wenbin Qian , Yinglong Wang , Jintao Huang

In multi-label learning, high dimensionality is the most prominent characteristic of the data. An efficient pre-processing step, feature selection, is required to mitigate the “curse of dimensionality” caused by irrelevant and redundant features in the high-dimensional feature space. However, in most practical applications the labels related to an instance differ in significance. Motivated by this, the paper integrates label distribution learning into multi-label feature selection in order to mine supervisory information that equivalence relations ignore in the label space of multi-label data. From the perspective of granular computing, a novel label enhancement algorithm based on the fuzzy similarity relation is presented; it uses the similarity between instances to explore hidden label relevance and to transform the logical labels of multi-label data into a label distribution. A label distribution feature selection algorithm is then presented that measures the significance of features within the fuzzy mutual information framework. The presented algorithm is compared with six state-of-the-art multi-label feature selection algorithms on twelve publicly available multi-label datasets. The experimental results indicate that it achieves a significant improvement over the existing algorithms.
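
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract gives no formulas, so every concrete choice below is an assumption made for illustration. Assumed here are the similarity `1 - |a - b|` on min-max scaled values, the similarity-weighted neighbour average used as label enhancement, the standard cardinality-based fuzzy entropy, and all function names (`fuzzy_similarity`, `enhance_labels`, `fuzzy_mutual_information`, `rank_features`), which are hypothetical.

```python
import numpy as np


def fuzzy_similarity(values, scale=True):
    """Fuzzy similarity relation between instances (n x n, entries in [0, 1]).

    Assumption: similarity is 1 - |a - b| per column, combined with min.
    """
    V = np.asarray(values, dtype=float)
    if V.ndim == 1:
        V = V[:, None]
    if scale:
        # Min-max scale each column so that differences lie in [0, 1].
        span = V.max(axis=0) - V.min(axis=0)
        span[span == 0] = 1.0  # constant columns: every pair is fully similar
        V = (V - V.min(axis=0)) / span
    diff = np.abs(V[:, None, :] - V[None, :, :])
    return np.clip(1.0 - diff, 0.0, 1.0).min(axis=2)


def enhance_labels(X, Y):
    """Turn logical labels Y (n x q, 0/1) into label distributions.

    Assumption: each instance borrows label mass from similar instances,
    then its row is normalised to sum to 1.
    """
    R = fuzzy_similarity(X)
    D = R @ np.asarray(Y, dtype=float)
    totals = D.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # guard for instances with no label mass
    return D / totals


def fuzzy_entropy(R):
    """Fuzzy entropy from the cardinality of each fuzzy similarity class."""
    n = R.shape[0]
    return float(-np.mean(np.log2(R.sum(axis=1) / n)))


def fuzzy_mutual_information(R, S):
    """I(R; S) = H(R) + H(S) - H(R and S), with min as the intersection."""
    return fuzzy_entropy(R) + fuzzy_entropy(S) - fuzzy_entropy(np.minimum(R, S))


def rank_features(X, Y):
    """Score each feature by fuzzy mutual information with the label distribution."""
    X = np.asarray(X, dtype=float)
    D = enhance_labels(X, Y)
    S = fuzzy_similarity(D, scale=False)  # D already lies in [0, 1]
    scores = np.array([
        fuzzy_mutual_information(fuzzy_similarity(X[:, j]), S)
        for j in range(X.shape[1])
    ])
    order = np.argsort(scores)[::-1]  # most informative feature first
    return order, scores
```

Calling `rank_features(X, Y)` with an `n x m` feature matrix and an `n x q` 0/1 label matrix returns the feature indices in descending order of score together with the scores. The sketch ranks features independently and therefore ignores redundancy between features, which a full selection method would also need to handle.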




Updated: 2021-06-20