Adaptive Data Structure Regularized Multiclass Discriminative Feature Selection, IEEE Transactions on Neural Networks and Learning Systems



Adaptive Data Structure Regularized Multiclass Discriminative Feature Selection
IEEE Transactions on Neural Networks and Learning Systems (IF 10.2), Pub Date: 2021-04-22, DOI: 10.1109/tnnls.2021.3071603
Mingyu Fan₁, Xiaoqin Zhang₂, Jie Hu₁, Nannan Gu₃, Dacheng Tao₄


Feature selection (FS), which aims to identify the most informative subset of input features, is an important approach to dimensionality reduction. In this article, a novel FS framework is proposed for both unsupervised and semisupervised scenarios. To make efficient use of the data distribution when evaluating features, the framework combines data structure learning (also referred to as data distribution modeling) and FS in a unified formulation, such that data structure learning improves the results of FS and vice versa. Two types of data structures, namely the soft and hard data structures, are learned and used in the proposed FS framework. The soft data structure refers to the pairwise weights among data samples, and the hard data structure refers to the estimated labels obtained from clustering or semisupervised classification. Both data structures are naturally formulated as regularization terms in the proposed framework. In the optimization process, the soft and hard data structures are learned from the data as represented by the selected features, and the most informative features are then reselected with reference to those data structures. In this way, the framework uses the interaction between data structure learning and FS to select the most discriminative and informative features. Following the proposed framework, a new semisupervised FS (SSFS) method is derived and studied in depth. Experiments on real-world data sets demonstrate the effectiveness of the proposed method.
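The alternating scheme described in the abstract (learn the soft and hard data structures from the currently selected features, then reselect features against those structures) can be illustrated with a small sketch. This is not the authors' formulation: the abstract does not give the objective function, so the code below swaps in three standard stand-ins as assumptions. The soft structure is a Gaussian-weighted kNN graph, the hard structure is obtained by label propagation, and features are scored by an l2,1-regularized least-squares regression onto the estimated labels. All function names, parameters (k, alpha, lam, n_rounds) and the synthetic data are hypothetical.

```python
import numpy as np

def soft_structure(Z, k=7):
    """Soft structure (assumed form): symmetric kNN graph with Gaussian weights."""
    sq = np.sum(Z ** 2, axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2 * Z @ Z.T, 0.0)
    np.fill_diagonal(D, np.inf)              # exclude self-neighbours
    idx = np.argsort(D, axis=1)[:, :k]       # k nearest neighbours per sample
    rows = np.repeat(np.arange(Z.shape[0]), k)
    cols = idx.ravel()
    sigma2 = D[rows, cols].mean() + 1e-12    # bandwidth from neighbour distances
    S = np.zeros(D.shape)
    S[rows, cols] = np.exp(-D[rows, cols] / sigma2)
    return np.maximum(S, S.T)

def hard_structure(S, y, n_classes, alpha=0.9):
    """Hard structure (assumed form): labels from label propagation.
    y holds class indices for labeled samples and -1 for unlabeled ones."""
    n = S.shape[0]
    d = 1.0 / np.sqrt(S.sum(axis=1) + 1e-12)
    Sn = S * d[:, None] * d[None, :]         # symmetrically normalized graph
    lab = np.flatnonzero(y >= 0)
    Y0 = np.zeros((n, n_classes))
    Y0[lab, y[lab]] = 1.0
    F = np.linalg.solve(np.eye(n) - alpha * Sn, Y0)
    labels = F.argmax(axis=1)
    labels[lab] = y[lab]                     # keep the given labels fixed
    return labels

def score_features(X, labels, n_classes, lam=1.0, n_iter=20):
    """Row-sparse regression onto the estimated labels, solved by
    iteratively reweighted least squares for the l2,1 penalty."""
    Xc = X - X.mean(axis=0)
    Y = np.eye(n_classes)[labels]
    Yc = Y - Y.mean(axis=0)
    d = np.ones(X.shape[1])
    for _ in range(n_iter):
        W = np.linalg.solve(Xc.T @ Xc + lam * np.diag(d), Xc.T @ Yc)
        d = 1.0 / (2.0 * np.linalg.norm(W, axis=1) + 1e-8)
    return np.linalg.norm(W, axis=1)         # one score per feature

def adaptive_fs(X, y, n_classes, n_rounds=5, k=7, lam=1.0):
    """Alternate between structure learning and feature reselection."""
    scores = np.ones(X.shape[1])
    for _ in range(n_rounds):
        Z = X * np.sqrt(scores / (scores.max() + 1e-12))  # reweighted features
        S = soft_structure(Z, k)
        labels = hard_structure(S, y, n_classes)
        scores = score_features(X, labels, n_classes, lam)
    return scores, labels

# Synthetic example: features 0 and 1 carry the class signal, the rest are noise.
rng = np.random.default_rng(0)
n = 100
true = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, 6))
X[:, :2] += 3.0 * (2 * true[:, None] - 1)
y = np.full(n, -1)
y[:5] = 0        # five labeled samples of class 0
y[50:55] = 1     # five labeled samples of class 1
scores, labels = adaptive_fs(X, y, n_classes=2)
print("feature ranking (best first):", np.argsort(scores)[::-1])
```

In this sketch the feature scores from one round rescale the data before the graph and the labels are re-estimated in the next round, which is one simple way to let structure learning and FS inform each other. The unified objective and its regularization terms are defined in the paper itself and may differ substantially from this illustration.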


Updated: 2021-04-22
