
Information Sciences

Volume 582, January 2022, Pages 146-166

A supervised multi-view feature selection method based on locally sparse regularization and block computing

https://doi.org/10.1016/j.ins.2021.09.009

Highlights

  • A supervised multi-view model is proposed to realize block-based feature selection.

  • The proposed model is composed of the sharing sub-models of all classes.

  • The sparse regularizer enhances block sparsity in terms of both features and views.

  • The proposed algorithm realizes block separation and the independent solution of subproblems.

  • Numerical experiments show the effectiveness of our method on large-scale datasets.

Abstract

With the increasing scale of collected multi-view data, how to handle large-scale multi-view data quickly and efficiently is a significant problem. In this paper, a novel supervised multi-view feature selection method based on locally sparse regularization and block computing is proposed to solve it. Specifically, the multi-view dataset is first divided into sub-blocks according to classes and views. Then, with the aid of the Alternating Direction Method of Multipliers (ADMM), a sharing sub-model is proposed to perform feature selection on each class by integrating each view's locally sparse regularizers with a shared loss that makes all views share a common penalty and regresses samples to their labels. Finally, all the sharing sub-models are fused to form the final generalized additive feature selection model, in which each sub-block adjusts its corresponding variables to perform block-based feature selection. In the optimization process, the proposed model can be decomposed into multiple separate subproblems, and an efficient optimization algorithm is proposed to solve them quickly. Comparison experiments with several state-of-the-art feature selection methods show that the proposed method is superior in both classification accuracy and training speed.

Introduction

In the real world, an object is often described by multiple views. For example, a person can be characterized from the perspectives of audio, text, and photos, and each perspective has its own physical significance. An image can be given various heterogeneous features through different descriptors, such as RGB, LBP, HOG, SURF, and so on. In general, different views represent different aspects of an object and can provide more information than a single view. Unlike single-view learning [24], [26], which uses the information obtained from one view, multi-view learning can integrate the strengths of multiple representations. With the rapid growth of multi-view data in the past decades, multi-view learning has been extensively studied and applied to many tasks, such as multi-view co-training [18], multi-view classification [23], multi-view clustering [17], multi-view feature selection [16], and so on.

In the multi-view learning process, the collected multi-view dataset tends to be high-dimensional, which makes it prone to the curse of dimensionality. Hence, it is necessary to remove redundant features from multi-view data. Dimensionality reduction is an essential means of data preprocessing, which can remove redundant features and reduce the training time of classifiers. Feature selection and feature extraction [31] are two classical dimensionality reduction techniques; feature selection directly selects the optimal subset from the original features, which has the advantage of not changing feature semantics. Consequently, multi-view feature selection has received wide attention.
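To make the distinction concrete, the following Python sketch (on hypothetical toy data) contrasts feature selection, which keeps a subset of the original columns and thus preserves their semantics, with feature extraction, which maps the data into a new feature space:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))   # 100 samples, 10 original features

# Feature selection: keep a subset of the original columns, so the
# retained features preserve their original semantics.
selected = [0, 3, 7]                 # indices found by some selection criterion
X_selected = X[:, selected]          # shape (100, 3)

# Feature extraction: project onto new axes (here a random projection),
# so each new feature is a combination of the originals.
W = rng.standard_normal((10, 3))
X_extracted = X @ W                  # shape (100, 3); semantics are changed
```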

Multi-view feature selection methods can be divided into supervised, semi-supervised, and unsupervised ones according to how much label information is used during feature selection. Supervised methods can use the discriminant information in all labels, making it easier to select discriminative features, so we focus on supervised multi-view feature selection in this paper. Xiao et al. [27] proposed a feature selection method based on two views. Subsequently, many multi-view feature selection methods [7], [13], [14], [15], [25], [30], [32], [35] have been proposed and have achieved good performance. However, most existing supervised multi-view methods concatenate the features from different view spaces into a long vector to perform feature selection, which leads to extremely high computational complexity and poses a big challenge when handling large-scale and high-dimensional datasets.

To perform feature selection rapidly and efficiently on large-scale and high-dimensional datasets, we propose a supervised multi-view feature selection method based on locally sparse regularization and block computing. Specifically, for the samples of each class, we establish a sharing sub-model that contains the locally sparse regularizers of the views and a shared loss to perform feature selection. In the local regularizer of each view, an l2,1-norm individual regularizer and an l1 group regularizer are adopted. On the one hand, minimizing the l2,1-norm individual regularizer makes the rows of the transformation sub-matrix corresponding to non-essential features tend to zero, so the features of each view relevant to the classification task can be selected. On the other hand, the l1 group regularizer helps determine whether a view contributes to a class; specifically, if a view has little relevance to a class, its corresponding transformation vector tends to zero. The shared loss makes all views share a common penalty that regresses samples to their labels and, since the l2,1-norm is adopted, reduces the influence of noise and outliers on the model. Finally, motivated by the generalized additive model in ADMM [3], the sharing sub-models across all classes are integrated to form the final model. In addition, an optimization method is designed to realize block separation and independent solution, which dramatically reduces the computational complexity.
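As an illustration of the penalty structure described above, the following sketch computes, for one view's transformation sub-matrix, an l2,1-norm individual term (a sum of row norms that drives the rows of irrelevant features toward zero) and a group term over the whole block (so an irrelevant view can be zeroed out entirely). This is only a minimal reading of the regularizers' roles, with hypothetical weights lam1 and lam2, not the paper's exact objective or its ADMM updates:

```python
import numpy as np

def l21_norm(W):
    """l2,1-norm: sum of the l2 norms of the rows of W.
    Minimizing it drives whole rows (i.e., features) toward zero."""
    return np.sum(np.linalg.norm(W, axis=1))

def local_regularizer(W_view, lam1, lam2):
    """Illustrative locally sparse regularizer for one view's
    transformation sub-matrix W_view:
      - the l2,1 term induces feature-level (row-wise) sparsity;
      - the group term treats the whole view as one group, so a view
        with little relevance to the class can be zeroed out entirely.
    This mirrors the penalty structure described in the text, not the
    paper's exact formulation."""
    individual = lam1 * l21_norm(W_view)
    group = lam2 * np.linalg.norm(W_view)   # Frobenius norm of the block
    return individual + group

# toy usage: 5 features in this view, regressing to C = 3 classes
rng = np.random.default_rng(0)
W = rng.standard_normal((5, 3))
print(local_regularizer(W, lam1=0.1, lam2=0.01))
```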

In summary, this paper makes the following contributions:

  • A novel supervised multi-view feature selection model is proposed by combining the sharing sub-models of all classes, which provides block-based feature selection.

  • Each sharing sub-model combines the locally sparse regularization terms with the shared loss to enhance block sparsity in terms of both features and views, fully considering the specificity of classes and the complementarity between views.

  • The large-scale optimization problem can be decomposed into multiple separate small-scale subproblems, which greatly reduces the computational complexity of the proposed solution algorithm (see the sketch following this list).

  • Compared with several state-of-the-art feature selection methods, the proposed method achieves higher average classification accuracy and faster training speed, especially on large-scale and high-dimensional datasets.
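As a minimal sketch of the block-separable structure behind the third contribution, the snippet below solves one small, independent subproblem per class block; a closed-form ridge regression stands in for the actual sharing sub-model, and the parallel map illustrates why separability reduces cost:

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def solve_subproblem(args):
    """Stand-in for one class's subproblem: a ridge regression solved in
    closed form. The actual sub-model couples locally sparse regularizers
    via ADMM; this only shows that the blocks decouple."""
    X_i, Y_i, lam = args
    d = X_i.shape[1]
    return np.linalg.solve(X_i.T @ X_i + lam * np.eye(d), X_i.T @ Y_i)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # hypothetical blocks: one (X_i, Y_i, lambda) triple per class
    blocks = [(rng.standard_normal((50, 20)),
               rng.standard_normal((50, 3)), 0.1) for _ in range(4)]
    # the subproblems are small and independent, so they can run in parallel
    with ProcessPoolExecutor() as pool:
        W_blocks = list(pool.map(solve_subproblem, blocks))
    print([W.shape for W in W_blocks])   # four (20, 3) solutions
```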

The paper is organized as follows. Section 2 reviews related work. Section 3 presents the proposed model and the solution procedure, and further analyzes the computational complexity of the proposed method. Section 4 presents extensive experiments and analysis. Finally, Section 5 summarizes the paper.

Section snippets

Related work

This section reviews related work on supervised, semi-supervised, and unsupervised matrix-based multi-view feature selection methods.

Supervised multi-view feature selection methods use all label information to perform feature selection. Wang et al. [25] adopted the G1-norm and l2,1-norm sparse regularizers to deal with multi-view classification and clustering tasks. Zhang et al. [32] proposed a novel robust multi-modal sequence-based (ROMS) method by adopting the G1-norm

New model

Given a multi-view data matrix $X = [X_1^T, \ldots, X_i^T, \ldots, X_C^T]^T \in \mathbb{R}^{n \times d}$, it contains $C$ classes, $n$ samples, $V$ views, and $d$ dimensions. Denote by $X_i = [X_i^1, \ldots, X_i^j, \ldots, X_i^V] \in \mathbb{R}^{n_i \times d}$ the data matrix of the $i$th class with all $V$ views, where $n_i$ is the number of samples in the $i$th class, $X_i^j \in \mathbb{R}^{n_i \times d_j}$ represents the data block corresponding to the $j$th view of the $i$th class, and $\sum_{j=1}^{V} d_j = d$. Denote the label matrix of the samples in the $i$th class by $Y_i = [Y_{i,1:}, \ldots, Y_{i,q:}, \ldots, Y_{i,n_i:}]^T \in \mathbb{R}^{n_i \times C}$, where $Y_{i,q:} = [y_{iq}^1, \ldots, y_{iq}^C]^T$ is the label vector
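Following this notation, the sketch below partitions a multi-view data matrix into the blocks $X_i^j$ by class and view; the assumption that each view occupies a contiguous run of $d_j$ columns is ours, for illustration:

```python
import numpy as np

def partition_blocks(X, labels, view_dims):
    """Split a multi-view data matrix X (n x d) into blocks X_i^j, where
    rows are grouped by class label and columns by view, with view j
    occupying d_j consecutive columns (the d_j sum to d). The contiguous
    column layout per view is an assumption about how views are stored."""
    col_splits = np.cumsum(view_dims)[:-1]
    blocks = {}
    for c in np.unique(labels):
        X_c = X[labels == c]                       # samples of class c
        for j, X_cj in enumerate(np.split(X_c, col_splits, axis=1)):
            blocks[(c, j)] = X_cj                  # block X_i^j
    return blocks

# toy example: 6 samples, 2 classes, two views of widths 3 and 2
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 5))
labels = np.array([0, 0, 1, 1, 0, 1])
blocks = partition_blocks(X, labels, view_dims=[3, 2])
print(blocks[(0, 1)].shape)   # class 0, view 1 -> (3, 2)
```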

Experiments

To verify the effectiveness of the proposed method, we conduct a series of experiments on multiple benchmark datasets. First, we introduce the relevant experimental settings, including detailed dataset information, several state-of-the-art comparison methods, classification methods, and experimental parameter settings. Then, we carefully analyze the performance of these methods from the aspects of accuracy and training time. Finally, we introduce the parameter sensitivity and convergence

Conclusion

In this paper, an efficient multi-view feature selection method that realizes locally sparse regularization and block computing has been proposed. The proposed model comprises the sharing sub-models of all classes, which consider the specificity of each class and the complementarity between views. The sub-models contain locally sparse regularizers and the shared losses of all classes. The locally sparse regularizer can enhance the sparsity of features and views. The shared loss can make all

CRediT authorship contribution statement

Qiang Lin: Data curation, Methodology, Formal analysis, Software, Visualization. Min Men: Validation, Writing - original draft. Liran Yang: Formal analysis, Writing - review & editing. Ping Zhong: Supervision, Conceptualization, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The authors would like to thank the reviewers for their valuable comments and suggestions to improve the quality of this paper.

References (36)

