

Similarity-based constraint score for feature selection
Knowledge-Based Systems (IF 8.8), Pub Date: 2020-09-17, DOI: 10.1016/j.knosys.2020.106429
Abderezak Salmi, Kamal Hammouche, Ludovic Macaire

To avoid the curse of dimensionality that results from a large number of features, only the most relevant features should be selected. Several scores involving must-link and cannot-link constraints have been proposed to estimate feature relevance. However, these constraint scores evaluate features one by one and ignore any correlation between them. In addition, they compute distances in the high-dimensional original feature space to evaluate the similarity between samples, so they are themselves affected by the curse of dimensionality. To address these drawbacks, we propose a new constraint score based on a similarity matrix computed in the selected feature subspace, which makes it possible to evaluate the relevance of a whole feature subset at once. Experiments on benchmark databases demonstrate the improvement brought by the proposed constraint score in both supervised and semi-supervised learning.
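The abstract does not give the proposed score's formula, so the sketch below is not the authors' method. It only illustrates the distinction the abstract draws, under stated assumptions: a ratio-style per-feature score (squared differences over must-link pairs divided by those over cannot-link pairs, lower meaning more relevant) next to a hypothetical subset-level analogue that measures distances only in the selected feature subspace. The function names, the toy data, and the subset-level form are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]  # indices of two samples that share a constraint


def per_feature_score(X: Sequence[Sequence[float]],
                      must_link: List[Pair],
                      cannot_link: List[Pair],
                      r: int) -> float:
    """Ratio-style score for a single feature r (lower = more relevant).

    Assumed form: must-link pairs should be close on feature r,
    cannot-link pairs should be far apart on it.
    """
    num = sum((X[i][r] - X[j][r]) ** 2 for i, j in must_link)
    den = sum((X[i][r] - X[j][r]) ** 2 for i, j in cannot_link)
    return num / den if den > 0 else float("inf")


def subset_score(X: Sequence[Sequence[float]],
                 must_link: List[Pair],
                 cannot_link: List[Pair],
                 subset: Sequence[int]) -> float:
    """Hypothetical subset-level analogue (NOT the paper's score).

    Same ratio, but with squared Euclidean distances computed only
    over the features in `subset`, so the subset is scored at once.
    """
    def d2(i: int, j: int) -> float:
        return sum((X[i][r] - X[j][r]) ** 2 for r in subset)

    num = sum(d2(i, j) for i, j in must_link)
    den = sum(d2(i, j) for i, j in cannot_link)
    return num / den if den > 0 else float("inf")


# Toy data (made up): 4 samples, 2 features.
X = [[0.0, 5.0], [0.1, 1.0], [1.0, 4.0], [1.1, 0.0]]
must_link = [(0, 1), (2, 3)]    # pairs assumed to share a class
cannot_link = [(0, 2), (1, 3)]  # pairs assumed to differ in class

s0 = per_feature_score(X, must_link, cannot_link, 0)
s1 = per_feature_score(X, must_link, cannot_link, 1)
s01 = subset_score(X, must_link, cannot_link, [0, 1])
print(s0, s1, s01)
```

On this toy data, feature 0 separates the cannot-link pairs while keeping must-link pairs close, so it receives a much lower score than feature 1. A subset-level score of this kind would be plugged into a search over candidate subsets rather than used to rank features independently.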


Updated: 2020-09-23
