Relevance assignation feature selection method based on mutual information for machine learning
Knowledge-Based Systems (IF 8.8), Pub Date: 2020-09-21, DOI: 10.1016/j.knosys.2020.106439
Liyang Gao, Weiguo Wu

As the subjects and environments of machine learning grow more complex, feature selection methods are used more and more often as an effective means of dimension reduction. However, existing feature selection methods fail to strike a balance between the accuracy of relevance evaluation and the efficiency of the search. To address this, we analyze the characteristics of the relevance between a feature set and the classification result. We then propose the Relevance Assignation Feature Selection (RAFS) method, based on mutual information theory, which assigns the relevance evaluation according to the redundancy. With this method, we can estimate the contribution of each feature in a feature set; this contribution is regarded as the value of the feature and is used as the heuristic index when searching for relevant features. A special dataset ("Grid World") with strongly interactive features is designed. Using Grid World and six other natural datasets, the proposed method is compared with six other feature selection methods. Results show that on the Grid World dataset, RAFS finds the correct relevant features with a probability above 90%, much higher than the other methods. On the six other datasets, RAFS also achieves the best classification accuracy.
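
The following is a minimal sketch, not the RAFS method itself, of why strongly interactive features are hard for mutual-information-based selection. It assumes a hypothetical toy dataset in which the label is the XOR of two binary features, and it uses a plain empirical estimate of mutual information. The function name `mutual_information` and the variables `f1`, `f2`, `f3` are illustrative choices and do not come from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y), in bits, of two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * log2(count * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: every combination of three binary features.
# The label depends only on the interaction of f1 and f2 (XOR); f3 is irrelevant.
rows = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
f1 = [r[0] for r in rows]
f2 = [r[1] for r in rows]
f3 = [r[2] for r in rows]
label = [a ^ b for a, b, _ in rows]

# Scored one at a time, each feature looks irrelevant.
mi_f1 = mutual_information(f1, label)
mi_f2 = mutual_information(f2, label)
mi_f3 = mutual_information(f3, label)

# Scored jointly, the pair (f1, f2) determines the label completely.
mi_pair = mutual_information(list(zip(f1, f2)), label)

print(mi_f1, mi_f2, mi_f3, mi_pair)
```

In this toy case each single feature has zero mutual information with the label, while the pair (f1, f2) carries one full bit. A search that ranks features individually cannot tell f1 and f2 apart from the irrelevant f3, which is the kind of situation the abstract describes for datasets with strongly interactive features.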




Updated: 2020-09-24