A supervised multi-view feature selection method based on locally sparse regularization and block computing
Introduction
In the real world, an object is often described by multiple views. For example, a person can be characterized from the audio, text, and photo perspectives, and each perspective has its own physical significance. An image can be represented by various heterogeneous features obtained through different descriptors, such as RGB, LBP, HOG, and SURF. In general, different views represent different aspects of an object and can provide more information than a single view. Unlike single-view learning [24], [26], which uses the information obtained from one view, multi-view learning can integrate the strengths of multiple representations. With the rapid growth of multi-view data over the past decades, multi-view learning has received extensive research attention and has been applied to many tasks, such as multi-view co-training [18], multi-view classification [23], multi-view clustering [17], multi-view feature selection [16], and so on.
During multi-view learning, the collected multi-view dataset tends to be high-dimensional, which makes it prone to the curse of dimensionality. Hence, it is necessary to remove redundant features from multi-view data. Dimensionality reduction is an essential data preprocessing step that removes redundant features and reduces the training time of classifiers. Feature selection and feature extraction [31] are two classical dimensionality reduction techniques; feature selection directly selects an optimal subset of the original features and therefore has the advantage of not changing feature semantics. Consequently, multi-view feature selection has received wide attention.
Multi-view feature selection methods can be divided into supervised, semi-supervised, and unsupervised ones according to how much label information is used during feature selection. Supervised methods can exploit the discriminant information contained in all labels, which makes it easier to select discriminative features, so we focus on supervised multi-view feature selection in this paper. Xiao et al. [27] proposed a feature selection method based on two views. Subsequently, many multi-view feature selection methods [7], [13], [14], [15], [25], [30], [32], [35] have been proposed and have achieved good performance. However, most existing supervised multi-view methods concatenate the features of different view spaces into a single long vector to perform feature selection, which leads to extremely high computational complexity and poses a great challenge when handling large-scale and high-dimensional datasets.
To perform feature selection rapidly and efficiently on large-scale and high-dimensional datasets, we propose a supervised multi-view feature selection method based on locally sparse regularization and block computing. Specifically, for the samples of each class, we establish a sharing sub-model that combines the locally sparse regularizers of the views with a shared loss to perform feature selection. In the local regularizer of each view, an $\ell_{2,1}$-norm individual regularizer and a group regularizer are adopted. On the one hand, minimizing the $\ell_{2,1}$-norm individual regularizer drives the rows of the transformation sub-matrix corresponding to non-essential features towards zero, so that the features of each view relevant to the classification task can be selected. On the other hand, the group regularizer helps determine whether a view contributes to a class: if a view has little relevance to a class, its corresponding transformation vector tends to zero. The shared loss makes all views share a common penalty that regresses the samples to their labels and, owing to the robust norm adopted, reduces the influence of noise and outliers on the model. Finally, motivated by the generalized additive (sharing) model in ADMM [3], the sharing sub-models of all classes are integrated to form the final model. In addition, an optimization method is designed to realize block separation and independent solution, which dramatically reduces the computational complexity.
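To make the effect of the row-sparsity regularizer concrete, the following minimal Python sketch (our own illustration, not the authors' implementation; the function names and the toy matrix are assumed for demonstration only) computes an $\ell_{2,1}$-type penalty on a transformation sub-matrix and ranks the features of a view by the Euclidean norms of their corresponding rows, so that rows driven towards zero are treated as non-essential.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W (an l2,1-type penalty)."""
    return np.sum(np.linalg.norm(W, axis=1))

def select_features_by_row_norm(W, k):
    """Rank features by the l2-norm of their rows in the transformation
    sub-matrix W and keep the top-k; rows pushed towards zero by the
    sparsity regularizer correspond to non-essential features."""
    row_scores = np.linalg.norm(W, axis=1)
    return np.argsort(row_scores)[::-1][:k]

# Toy illustration: a transformation sub-matrix with two (nearly) zero rows.
W = np.array([[0.9, -0.4],
              [0.0,  0.0],     # non-essential feature
              [0.3,  0.7],
              [1e-6, 0.0]])    # non-essential feature
print(l21_norm(W))                        # value of the sparsity penalty
print(select_features_by_row_norm(W, 2))  # indices of the two retained features
```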
In summary, this paper makes the following contributions:
- A novel supervised multi-view feature selection model is proposed by combining the sharing sub-models of all classes, which provides block-based feature selection.
- Each sharing sub-model combines the locally sparse regularization terms with the shared loss to enhance the sparsity of blocks from the two aspects of features and views, which fully considers the specificity of classes and the complementarity between views.
- The large-scale optimization problem can be decomposed into multiple separate small-scale subproblems, which greatly reduces the computational complexity of the proposed solution algorithm (an illustrative sketch follows this list).
- Compared with several state-of-the-art feature selection methods, the proposed method achieves higher average classification accuracy and faster training speed, especially on large-scale and high-dimensional datasets.
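To illustrate the block computing idea stated in the third contribution, the following hedged Python sketch assumes that the overall objective decouples across per-class, per-view blocks, so each small block can be solved independently. The ridge-style closed form used here is only a stand-in for the per-block updates actually derived in the paper, and all variable names are assumptions.

```python
import numpy as np

def solve_blocks_independently(class_view_blocks, Y_per_class, lam=1.0):
    """Solve one small subproblem per (class, view) block instead of one
    large joint problem over the concatenated long feature vector.
    class_view_blocks[i][j] is the n_i x d_j data block of view j in class i;
    Y_per_class[i] is the n_i x C label matrix of class i."""
    W = {}
    for i, view_blocks in enumerate(class_view_blocks):
        for j, X_ij in enumerate(view_blocks):
            d_j = X_ij.shape[1]
            # ridge-style placeholder: (X_ij^T X_ij + lam*I)^{-1} X_ij^T Y_i
            W[(i, j)] = np.linalg.solve(X_ij.T @ X_ij + lam * np.eye(d_j),
                                        X_ij.T @ Y_per_class[i])
    return W
```

Because each d_j-by-d_j system is solved separately, the cost of this sketch grows with the largest block rather than with the full dimension d, which mirrors the kind of reduction in computational complexity claimed above.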
The paper is organized as follows. Section 2 reviews some related work. Section 3 shows the proposed model and the solution procedure. The computational complexity of the proposed method is further analyzed. Section 4 presents the extensive experiments and analysis. At last, Section 5 makes a summary of the paper.
Section snippets
Related work
This section reviews the related work on the supervised, semi-supervised, and unsupervised matrix-based multi-view feature selection methods.
The supervised multi-view feature selection methods use all label information to perform feature selection. Wang et al. [25] adopted the -norm and -norm sparse regularizers to deal with the multi-view classification and clustering tasks. Zhang et al. [32] proposed a novel robust multi-modal sequence-based (ROMS) method by adopting the -norm
New model
Given a multi-view data matrix $X \in \mathbb{R}^{n \times d}$, it contains $C$ classes, $n$ samples, $V$ views, and $d$ dimensions. Denote by $X_i = [X_i^1, X_i^2, \ldots, X_i^V] \in \mathbb{R}^{n_i \times d}$ the data matrix of the $i$th class with all $V$ views, where $n_i$ is the number of samples in the $i$th class, $X_i^j \in \mathbb{R}^{n_i \times d_j}$ represents the data block corresponding to the $j$th view of the $i$th class, and $\sum_{j=1}^{V} d_j = d$. Denote the label matrix of the samples in the $i$th class by $Y_i \in \mathbb{R}^{n_i \times C}$, where each row is the label vector
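The block notation above can be made concrete with a short Python sketch; it is an illustration under assumed variable names (X, labels, view_dims), not the authors' code. Given the stacked data matrix, the class labels, and the per-view dimensions, it extracts the data block of the $j$th view of the $i$th class.

```python
import numpy as np

def class_view_blocks(X, labels, view_dims):
    """Split X (n x d, with d = sum(view_dims)) into blocks: the rows of
    class i restricted to the columns of view j."""
    assert X.shape[1] == sum(view_dims)
    offsets = np.cumsum([0] + list(view_dims))
    blocks = {}
    for i in np.unique(labels):
        X_i = X[labels == i]                      # n_i x d sub-matrix of class i
        for j in range(len(view_dims)):
            blocks[(i, j)] = X_i[:, offsets[j]:offsets[j + 1]]   # n_i x d_j block
    return blocks

# Toy usage: 3 views with dimensions 4, 3 and 2, and two classes.
X = np.random.randn(10, 9)
labels = np.array([0, 0, 1, 1, 0, 1, 0, 1, 1, 0])
blocks = class_view_blocks(X, labels, view_dims=[4, 3, 2])
print(blocks[(1, 2)].shape)   # block of the 3rd view for the 2nd class: (5, 2)
```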
Experiments
To verify the effectiveness of the proposed method, we conduct a series of experiments on multiple benchmark datasets. First, we introduce the relevant experimental settings, including detailed dataset information, several state-of-the-art comparison methods, classification methods, and experimental parameter settings. Then, we carefully analyze the performance of these methods from the aspects of accuracy and training time. Finally, we introduce the parameter sensitivity and convergence
Conclusion
In this paper, an efficient multi-view feature selection method that realizes locally sparse regularization and block computing has been proposed. The proposed model comprises the sharing sub-models of all classes, which consider the specificity of each class and the complementarity between views. The sub-models contain the locally sparse regularizers and the shared losses of all classes. The locally sparse regularizer can enhance the sparsity of features and views. The shared loss can make all
CRediT authorship contribution statement
Qiang Lin: Data curation, Methodology, Formal analysis, Software, Visualization. Min Men: Validation, Writing - original draft. Liran Yang: Formal analysis, Writing - review & editing. Ping Zhong: Supervision, Conceptualization, Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
The authors would like to thank the reviewers for their valuable comments and suggestions to improve the quality of this paper.
References (36)
- et al., Multi-view feature selection via nonnegative structured graph learning, Neurocomputing (2020)
- et al., Semi-supervised multi-view maximum entropy discrimination with expectation Laplacian regularization, Inf. Fusion (2019)
- et al., A novel low-rank hypergraph feature selection for multi-view classification, Neurocomputing (2017)
- et al., Robust unsupervised feature selection via matrix factorization, Neurocomputing (2017)
- et al., Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power, Inf. Sci. (2010)
- et al., Learning robust affinity graph representation for multi-view clustering, Inf. Sci. (2021)
- et al., Robust feature selection via simultaneous capped norm and sparse regularizer minimization, Neurocomputing (2018)
- et al., A sharing multi-view feature selection method via Alternating Direction Method of Multipliers, Neurocomputing (2019)
- et al., Multi-view subspace clustering via partition fusion, Inf. Sci. (2021)
- et al., Semi-supervised feature selection analysis with structured multi-view sparse regularization, Neurocomputing (2019)