A novel approach to attribute reduction based on weighted neighborhood rough sets
Knowledge-Based Systems ( IF 8.8 ) Pub Date : 2021-02-27 , DOI: 10.1016/j.knosys.2021.106908
Meng Hu , Eric C.C. Tsang , Yanting Guo , Degang Chen , Weihua Xu

Attribute reduction based on neighborhood rough sets is a common dimension reduction method that has been widely used in machine learning and data mining. In existing neighborhood rough set models, every attribute has the same weight (degree of importance). In this work, we introduce different weights into neighborhood relations and propose a novel approach to attribute reduction. The main motivation is to fully mine the correlation between attributes and decisions before calculating neighborhood relations, so that attributes with high correlation are assigned higher weights. We first construct a Weighted Neighborhood Rough Set (WNRS) model based on weighted neighborhood relations and discuss its properties. A WNRS-based dependency is then defined to evaluate the significance of attribute subsets. We design a greedy search algorithm based on WNRS to select an attribute subset that has both strong correlation and high dependency. Furthermore, we use isometric search to find the optimal neighborhood threshold. Finally, ten datasets from the UCI machine learning repository and the ELVIRA Biomedical data set repository are used to compare the performance of WNRS with that of other state-of-the-art reduction algorithms. The experimental results show that WNRS is feasible and effective, achieving higher classification accuracy and compression ratio.
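The abstract does not give the paper's formal definitions, so the following is only a minimal sketch of the general idea under stated assumptions: a weighted Euclidean distance defines each sample's neighborhood, the dependency is the fraction of samples whose neighborhood contains a single decision class, and a forward greedy search adds the attribute with the largest dependency gain. The function names, the distance form, the stopping rule, and the hand-picked weights are all hypothetical. In the paper the weights come from attribute-decision correlation, which is not reproduced here.

```python
from math import sqrt

def weighted_distance(x, z, attrs, weights):
    # Assumed form: weighted Euclidean distance over the chosen attribute indices.
    return sqrt(sum(weights[a] * (x[a] - z[a]) ** 2 for a in attrs))

def dependency(X, y, attrs, weights, delta):
    # Fraction of samples whose delta-neighborhood holds only one decision class.
    if not attrs:
        return 0.0
    consistent = 0
    for i, xi in enumerate(X):
        neighbors = [j for j, xj in enumerate(X)
                     if weighted_distance(xi, xj, attrs, weights) <= delta]
        if all(y[j] == y[i] for j in neighbors):
            consistent += 1
    return consistent / len(X)

def greedy_reduct(X, y, weights, delta, eps=1e-9):
    # Forward greedy search: repeatedly add the attribute that raises the
    # dependency most, and stop when no attribute improves it by more than eps.
    selected, best = [], 0.0
    remaining = set(range(len(X[0])))
    while remaining:
        gains = {a: dependency(X, y, selected + [a], weights, delta)
                 for a in remaining}
        # sorted() makes ties resolve to the lowest attribute index.
        a_star = max(sorted(gains), key=gains.get)
        if gains[a_star] - best <= eps:
            break
        selected.append(a_star)
        best = gains[a_star]
        remaining.remove(a_star)
    return selected, best

# Toy data (invented for illustration): attribute 0 separates the two classes,
# attribute 1 does not. The weights are hand-picked, not computed.
X = [[0.1, 0.5], [0.2, 0.4], [0.8, 0.5], [0.9, 0.45]]
y = [0, 0, 1, 1]
weights = [0.9, 0.1]
reduct, gamma = greedy_reduct(X, y, weights, delta=0.2)
print(reduct, gamma)
```

The threshold `delta` is fixed in this sketch. The isometric search for the optimal neighborhood threshold mentioned in the abstract is not implemented.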




Updated: 2021-03-10