Multi-label feature selection via manifold regularization and dependence maximization
Introduction
In traditional supervised single-label learning, each instance belongs to only one class label. However, in many real applications it is more appropriate for an instance to be associated with multiple class labels, since the instance can carry multiple semantic meanings at the same time [1,2]. In such cases, multi-label learning is needed to assign multiple labels to each instance. The idea of multi-label learning was first used in the field of text categorization, where a document is related to several topics simultaneously [3,4]. Since then, it has attracted much interest and has been applied to a wide range of applications, such as automatic image annotation [5], music emotion classification [6] and bioinformatics [7].
Similar to single-label learning, multi-label learning also faces the curse of dimensionality, since multi-label data sets usually contain instances with a large number of features [8,9]. Many of these features are irrelevant or redundant. They not only increase computational costs in time and space, but also lead to poor classification performance. It is therefore necessary to remove them through dimensionality reduction. Feature extraction and feature selection are the two main approaches to dimensionality reduction. The former maps the original feature space to a lower-dimensional subspace [10], [11], [12], while the latter directly selects a small feature subset from the whole feature set [13], [14], [15], [16]. The key difference is that feature extraction creates new features that are not interpretable, whereas feature selection preserves the physical meanings of the original features. Hence, feature selection has attracted increasing attention and has shown its effectiveness in improving the performance of multi-label learning algorithms.
Recently, multi-label feature selection methods based on the sparse regression model have proven effective [17], [18], [19], [20], [21]. These methods usually consider a least squares regression model with a sparse regularization term and additional constraint conditions, and score the importance of each feature based on the regression coefficients. However, most of them assume a linear relationship between the data space and the label space, so that the class labels are linearly regressed on the original data. Unfortunately, this assumption often does not hold.
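To make the sparse-regression scoring idea concrete, the following is a minimal sketch (our own illustration, not the paper's model) that solves a least squares regression from features to labels with an l2,1-norm penalty by proximal gradient descent, and ranks features by the row norms of the coefficient matrix:

```python
import numpy as np

def l21_feature_scores(X, Y, lam=0.5, n_iter=300):
    """Proximal gradient for min_W ||XW - Y||_F^2 + lam * ||W||_{2,1}.

    Returns one importance score per feature: the l2 norm of the
    corresponding row of W (rows of irrelevant features shrink to zero).
    """
    n, d = X.shape
    W = np.zeros((d, Y.shape[1]))
    step = 1.0 / (2 * np.linalg.norm(X, 2) ** 2)        # 1 / Lipschitz constant
    for _ in range(n_iter):
        G = W - step * 2 * X.T @ (X @ W - Y)            # gradient step on the LS term
        norms = np.linalg.norm(G, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1 - step * lam / np.maximum(norms, 1e-12))
        W = G * shrink                                   # row-wise soft thresholding
    return np.linalg.norm(W, axis=1)

# Synthetic check: only the first 3 of 20 features generate the labels.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 20))
W_true = np.zeros((20, 4))
W_true[:3] = 2 * rng.standard_normal((3, 4))
Y = X @ W_true + 0.1 * rng.standard_normal((200, 4))
scores = l21_feature_scores(X, Y)
print(sorted(np.argsort(scores)[::-1][:3].tolist()))    # the 3 informative features
```

The l2,1 norm couples all labels per feature, so a feature is kept or discarded as a whole; this is the standard device behind the regression-based selectors cited above.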
According to spectral regression (SR) [22], which combines spectral graph analysis and sparse regression for subspace learning, each sample is regressed to its own manifold structure. This idea has been adopted in several manifold-based feature selection methods [14, 19, 23, 24]. In this paper, our proposed method, named multi-label feature selection via manifold regularization and dependence maximization (MRDM), is also based on the SR model, but with two added constraints: a dependence constraint between the low-dimensional embedding and the associated class labels, and a structure constraint between the embedding and the original data. In MRDM, the Hilbert-Schmidt Independence Criterion (HSIC) is applied as the measurement of dependence due to its simplicity and neat theoretical properties [25], and the graph Laplacian is computed to characterize the nonlinear geometric structure of the data. The most representative features are selected through sparsity regularization together with the above dependence and structure constraints. The main contributions of our work are summarized as follows:
- Presenting a new multi-label feature selection method that efficiently combines manifold regularization and dependence maximization.
- Introducing an HSIC-based measurement to evaluate the dependence between the manifold space and the label space.
- Developing an iterative optimization method with good convergence to solve the objective function of MRDM.
- Conducting extensive experiments on various multi-label data sets to demonstrate the superiority of the proposed method.
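The HSIC measure adopted above admits a simple empirical estimator. The sketch below (our own illustration; linear kernels and the biased estimator, not necessarily the paper's exact choices) computes HSIC between an embedding and a label matrix, and shows that dependent pairs score higher than independent ones:

```python
import numpy as np

def hsic(X, Y):
    """Biased empirical HSIC between row-aligned samples X (n x d) and Y (n x c).

    With linear kernels K = X X^T and L = Y Y^T, and centering matrix
    H = I - (1/n) 11^T, the estimate is trace(K H L H) / (n - 1)^2.
    """
    n = X.shape[0]
    K = X @ X.T                               # kernel on the embedding space
    L = Y @ Y.T                               # kernel on the label space
    H = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
Y_dep = X @ rng.standard_normal((5, 3))       # labels generated from X
Y_ind = rng.standard_normal((50, 3))          # labels independent of X
print(hsic(X, Y_dep) > hsic(X, Y_ind))        # prints True: dependence raises HSIC
```

Maximizing this quantity over the embedding is what ties the learned manifold representation back to the class labels in the MRDM objective.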
The rest of this paper is organized as follows. A brief review of related work is given in Section 2, and the details of the proposed method are presented in Section 3. In Section 4, the experimental results are analyzed. Finally, Section 5 concludes and discusses issues for future work.
Related work
Feature selection is considered an effective tool to alleviate the "curse of dimensionality" and to improve the classification performance of multi-label learning. Like single-label feature selection methods, multi-label feature selection methods can be grouped into three categories [8]: filter methods, wrapper methods and embedded methods. Among them, wrapper methods apply certain search strategies to obtain feature subsets and evaluate them by using the classifier that will
Proposed approach
In this section, we propose a new multi-label feature selection method via manifold regularization and dependence maximization, named MRDM. First, we briefly summarize the symbols used in the paper. Then a detailed description of the proposed algorithm is presented. Finally, we introduce an effective solution to the optimization problem of the algorithm.
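MRDM characterizes the nonlinear geometric structure of the data with a graph Laplacian. As background, here is a minimal sketch of building one from a k-nearest-neighbor graph with a heat (Gaussian) kernel; this is a common construction, offered as an assumption rather than the paper's exact recipe:

```python
import numpy as np

def knn_graph_laplacian(X, k=5, sigma=1.0):
    """Unnormalized graph Laplacian L = D - S from a k-NN similarity graph.

    S holds heat-kernel weights on Euclidean distances between neighbors;
    the graph is symmetrized so L is symmetric positive semi-definite.
    """
    n = X.shape[0]
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    S = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(D2[i])[1:k + 1]                  # k nearest neighbors, skip self
        S[i, idx] = np.exp(-D2[i, idx] / (2 * sigma ** 2))
    S = np.maximum(S, S.T)                                # symmetrize the graph
    return np.diag(S.sum(axis=1)) - S                     # L = D - S

X = np.random.default_rng(2).standard_normal((30, 4))
L = knn_graph_laplacian(X)
print(np.allclose(L @ np.ones(30), 0))    # prints True: constants lie in the null space
```

The quadratic form trace(F^T L F) then penalizes embeddings F that vary sharply between neighboring samples, which is the manifold-regularization term used by SR-style methods.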
Experiments
In this section, we compare our method MRDM with eight state-of-the-art multi-label feature selection methods and a baseline method (without feature selection) on eight commonly used multi-label data sets.
Results and discussion
The experimental results of MRDM and the comparing algorithms on the eight data sets, in terms of the five evaluation metrics, are given in Figs. 1–8. The horizontal axis indicates the number of selected features and the vertical axis indicates the classification performance on each evaluation criterion. From the figures, we can see that the performances of all the feature selection methods are generally improved with the
Conclusion
In this paper, we propose a novel multi-label feature selection method via manifold regularization and dependence maximization (MRDM). In some sparse feature selection methods, the data space is directly mapped into the label space through a coefficient matrix, which is inappropriate due to the lack of linear relationship between data space and label space in most cases. Inspired by the spectral regression, we propose to replace the label space with a low-dimensional manifold embedding. This
Declaration of Competing Interest
None.
References (45)
- et al., Multi-label text categorization using l21-norm minimization extreme learning machine, Neurocomputing (2017)
- et al., Image multi-label annotation based on supervised nonnegative matrix factorization with new matching measurement, Neurocomputing (2017)
- et al., Multi-label maximum entropy model for social emotion classification over short text, Neurocomputing (2016)
- et al., A systematic review of multi-label feature selection and a new method based on label construction, Neurocomputing (2016)
- et al., Mutual information-based feature selection for multi-label classification, Neurocomputing (2013)
- et al., Multi-label feature selection with shared common mode, Pattern Recognit. (2020)
- et al., Manifold regularized discriminative feature selection for multi-label learning, Pattern Recognit. (2019)
- et al., Feature selection for multi-label naive Bayes classification, Inf. Sci. (2009)
- et al., Feature selection for multi-label classification using multivariate mutual information, Pattern Recognit. Lett. (2013)
- et al., Multi-label feature selection based on max-dependency and min-redundancy, Neurocomputing (2015)
- Mutual information based multi-label feature selection via constrained convex optimization, Neurocomputing
- Manifold-based constraint Laplacian score for multi-label feature selection, Pattern Recognit. Lett.
- ML-KNN: a lazy learning approach to multi-label learning, Pattern Recognit.
- Mining multi-label data, in: Data Mining and Knowledge Discovery Handbook
- A review on multi-label learning algorithms, IEEE Trans. Knowl. Data Eng.
- BoosTexter: a boosting-based system for text categorization, Mach. Learn.
- iATC-mISF: a multi-label classifier for predicting the classes of anatomical therapeutic chemicals, Bioinformatics
- Recent advances in feature selection and its applications, Knowl. Inf. Syst.
- Multi-label informed latent semantic indexing
- Multilabel dimensionality reduction via dependence maximization, ACM Trans. Knowl. Discov. Data
- Multi-label linear discriminant analysis
- Laplacian score for feature selection
Rui Huang received her Ph.D. degree in the School of Electronic and Information from Northwestern Polytechnical University in 2006. Currently, she is an associate professor at the School of Communication and Information Engineering, Shanghai University, China. Her research areas are artificial intelligence and machine learning.
Zhejun Wu received his B.E. degree from Shanghai University in 2018. He is currently working toward an M.S. degree in the School of Communication and Information Engineering, Shanghai University, China. His research focuses on multi-label feature selection.