Feature selection and threshold method based on fuzzy joint mutual information
International Journal of Approximate Reasoning (IF 3.2), Pub Date: 2021-02-23, DOI: 10.1016/j.ijar.2021.01.003
Omar A.M. Salem , Feng Liu , Yi-Ping Phoebe Chen , Xi Chen

Improving classification performance is one of the main challenges in a variety of real-world applications. Unfortunately, classification models are sensitive to undesirable features of the data, such as redundant and irrelevant features. Feature selection (FS) is a powerful way to counter the negative effect of these features. Among the various approaches, feature selection based on mutual information (MI) is an effective way to select the significant features and reject the undesirable ones. Although most existing methods can estimate feature relevancy efficiently, they may have difficulty estimating feature redundancy well, because redundancy is estimated individually between the candidate feature and each feature of the pre-selected subset. To address this limitation, this paper introduces a novel feature selection method called Fuzzy Joint Mutual Information (FJMI). The proposed method also overcomes two other common limitations: it handles continuous features without information loss, and it returns the best feature subset automatically, without a user-defined threshold. To evaluate its effectiveness, FJMI is compared with six conventional and state-of-the-art feature selection methods. Experimental results on 16 moderate-sized benchmark datasets show a promising improvement by the proposed method in terms of classification performance, feature selection stability, and number of selected features.
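
As a rough illustration of the conventional setting the abstract describes (not the FJMI method itself, whose details are not given here), the following sketch shows a greedy MI-based selector for discrete features. It scores each candidate by its relevance to the label minus its mean pairwise MI with the already selected features, which is the kind of individual redundancy estimation the abstract refers to. The function names, the scoring rule, the fixed subset size `k`, and the toy data are all assumptions made for this example.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def greedy_mi_selection(features, labels, k):
    """Greedily pick k feature names (hypothetical helper).

    Score = relevance I(f; Y) minus the mean of I(f; s) over selected s,
    i.e. redundancy is estimated one selected feature at a time.
    """
    selected = []
    remaining = list(features)

    def score(name):
        relevance = mutual_information(features[name], labels)
        if not selected:
            return relevance
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data, made up for illustration: "a_copy" duplicates "a",
# while "b" carries complementary information about the label.
labels = [0, 0, 1, 1, 2, 2, 3, 3]
features = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],
    "a_copy": [0, 0, 0, 0, 1, 1, 1, 1],
    "b": [0, 0, 1, 1, 0, 0, 1, 1],
}
selected = greedy_mi_selection(features, labels, k=2)
print(selected)
```

Note that this sketch requires discrete features and a user-chosen `k`; these are the two limitations the abstract says FJMI is designed to avoid.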




Updated: 2021-03-01