The similarity-consensus regularized multi-view learning for dimension reduction

https://doi.org/10.1016/j.knosys.2020.105835

Abstract

Over the past decades, learning a low-dimensional space with discriminative information for dimension reduction (DR) has attracted a surge of interest. However, it is difficult for these DR methods to achieve satisfactory performance when dealing with features from multiple views. In multi-view learning problems, one instance can be represented by multiple heterogeneous features, which are highly relevant yet may look quite different from each other. In addition, the correlations between features from different views often vary greatly, which challenges the capability of multi-view learning methods. Consequently, constructing a multi-view learning framework with generalization and scalability, which exploits multi-view information as much as possible, is necessary but challenging. To this end, this paper proposes a novel multi-view learning framework based on similarity consensus, which makes full use of the correlations among multi-view features while considering the scalability and robustness of the framework. It aims to straightforwardly extend existing DR methods into the multi-view learning domain by preserving the similarity consensus between different views to capture the low-dimensional embedding. Two schemes, based on pairwise consensus and centroid consensus, are proposed to force multiple views to learn from each other, and an iterative alternating strategy is developed to obtain the optimal solution. The proposed method is evaluated on 5 benchmark datasets, and comprehensive experiments show that the proposed multi-view framework yields comparable and promising performance relative to several well-known methods.

Introduction

In many real-world applications, such as image retrieval [5], [6], text categorization [7], [8] and face recognition [9], [10], raw data are often collected from different kinds of viewpoints [1], [2], [3], [4]. For example, web pages usually consist of title, page-text and hyperlink information. Similarly, an image can be described with color, texture or shape information, such as HSV, Local Binary Pattern (LBP) [11], Gist [12], Histograms of Oriented Gradients (HoG) [13], and Edge Direction Histogram (EDH) [14]. Different from single-view data, which only contains partial information, multi-view data usually carries complementary information across different views. Even though multi-view features usually contain more useful information than the single-view scenario, the high dimensionality of the features and the difficulty of integrating different views affect the efficiency and performance of the application system. To address these issues, numerous DR methods and multi-view learning methods have been proposed.

To reduce the time consumption and computational cost caused by high-dimensional features, a variety of DR methods have been proposed to find a low-dimensional space that preserves certain properties of the raw features. Existing DR methods can be mainly divided into three categories: subspace learning [10], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], kernel learning [26], [27], [28] and manifold learning [29], [30], [31], [32]. Two classical subspace learning methods based on linear transforms are Principal Components Analysis (PCA) [16] and Linear Discriminant Analysis (LDA) [17], [18]: PCA maximizes the global variance of the low-dimensional features, while LDA maximizes the ratio of the between-class scatter to the within-class scatter, to obtain the low-dimensional embedding. Compared with these two methods, which only maintain the global structure of the original data, other DR methods aim to find an optimal subspace while preserving the local relations among samples by different means, such as Locality Preserving Projections (LPP) [19], Neighborhood Preserving Embedding (NPE) [20], Locality Sensitive Discriminant Analysis (LSDA) [21], Large Margin Nearest Neighbor (LMNN) [22], Marginal Fisher Analysis (MFA) [23], and Sparsity Preserving Projections (SPP) [10]. Unlike the linear methods above, kernel methods aim to find a low-dimensional space for nonlinear high-dimensional data by using kernel tricks. Representative works include Kernel Principal Components Analysis (KPCA) [26], Kernel Fisher Discriminant Analysis (KFDA) [27], and Kernel Large Margin Component Analysis (KLMCA) [28], which extend PCA, LDA and LMNN into the nonlinear subspace learning domain, respectively. Besides kernel methods, manifold learning has shown its effectiveness for nonlinear dimension reduction: it learns an embedded low-dimensional manifold by preserving the local geometric information of the original high-dimensional space. Representative manifold learning algorithms include Isometric Mapping (Isomap) [29], Laplacian Eigenmaps (LE) [30], Locally Linear Embedding (LLE) [31], and Local Tangent Space Alignment (LTSA) [32]. These DR methods mainly focus on a single view and cannot be directly extended to multi-view cases.
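For reference, the objectives of the two classical subspace learning methods can be written in a common trace form (the scatter-matrix notation $S_t$, $S_b$, $S_w$ below is ours):

$$\text{PCA:}\quad \max_{W}\ \operatorname{tr}\big(W^{T} S_t W\big)\ \ \text{s.t.}\ W^{T} W = I, \qquad \text{LDA:}\quad \max_{W}\ \operatorname{tr}\big((W^{T} S_w W)^{-1} W^{T} S_b W\big),$$

where $S_t$ is the total scatter (covariance) matrix and $S_b$, $S_w$ are the between-class and within-class scatter matrices; both problems reduce to (generalized) eigenvalue problems.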

To integrate the rich information from different viewpoints, a variety of multi-view learning methods [33], [34], [35], [36], [37], [38], [39] have been proposed in the past decade. It has been verified [34] that Canonical Correlation Analysis (CCA) [40] can be used to project two views into a common subspace by maximizing the cross-correlation between them. CCA has been further generalized to the multi-view scenario as Multi-view Canonical Correlation Analysis (MCCA) [35]. Multi-view Discriminant Analysis [36] extends LDA to the multi-view setting by projecting multi-view features into one discriminative common subspace. The paper [37] proposes Generalized Latent Multi-view Subspace Clustering, which jointly learns the latent representation and the multi-view subspace representation within a unified framework. Besides these multi-view learning methods, several studies based on multi-graph learning have been developed. Multiview Spectral Embedding (MSE) [38] incorporates conventional algorithms with multi-view data to find a common low-dimensional subspace, exploiting graph-based low-dimensional representations. The work [39] proposes a co-regularized multi-view spectral clustering framework that captures complementary information among different viewpoints by co-regularizing the clustering hypotheses. In addition, works such as [41], [42], [43], [44], [45], [46], [47], [48], [49], [50] also obtain promising performance in the multi-view learning setting. For example, COMIC [41] projects each data point into a space in which two properties are satisfied, i.e., geometric consistency and cluster assignment consistency, which are specifically designed for different goals. Even though these methods have made good progress in integrating multi-view information, limitations in generalization and scalability remain.
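For concreteness, CCA's two-view objective can be written as follows (with $C_{xy}$ the cross-covariance between views $x$ and $y$, and $C_{xx}$, $C_{yy}$ the within-view covariances; this notation is ours):

$$\max_{w_x,\, w_y}\ \frac{w_x^{T} C_{xy}\, w_y}{\sqrt{w_x^{T} C_{xx}\, w_x}\ \sqrt{w_y^{T} C_{yy}\, w_y}},$$

which projects the two views onto maximally correlated directions and is solved as a generalized eigenvalue problem; MCCA [35] extends this criterion to more than two views.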

In the real world, one object can usually be described from different views, which gives rise to both high-dimensional data processing problems and the difficulty of integrating different views. As a well-known family of high-dimensional data processing methods, DR methods have attracted wide attention due to their excellent performance. Regarding the integration of different views, some multi-view methods achieve considerable performance, but their limitations in generalization and scalability cannot be neglected. In particular, it is difficult for these works to extend single view-based DR methods to the multi-view setting, so they cannot take full advantage of existing DR methods when integrating the rich information from different views.

Based on the discussion above, we investigate the multi-view learning problem from the following two aspects:

  • Is it feasible to extend existing DR methods, including subspace learning, kernel learning, and manifold learning, within a unified framework to handle multi-view problems?

  • How to integrate heterogeneous features from different views?

In light of these considerations, this paper first proposes a novel similarity consensus-based multi-view learning framework for manifold learning methods that maintain the local linear structure of the geometric manifold space. Specifically, the correlation between the similarity matrices of two views is utilized as the consensus term, and a pairwise consensus-based framework and a centroid consensus-based framework are designed to integrate information from multiple views. Then, an optimization algorithm using an iterative alternating strategy is developed to obtain the optimal solution of the framework. Furthermore, we extend the framework to subspace learning and kernel learning, so that most single view-based DR methods can be extended to perform dimension reduction on multi-view features. Finally, experiments are conducted on 5 benchmark datasets. To sum up, the contributions of this paper are as follows:

  • A novel multi-view manifold learning framework based on similarity consensus is proposed to integrate different information from multiple views. Furthermore, we propose an effective and robust iterative method to seek an optimal solution for the multi-view framework.

  • It is not difficult to show that most single view-based subspace learning and kernel learning methods can be cast as a special form of the quadratically constrained quadratic program (QCQP), and we extend these works into our multi-view framework in this paper.

  • The experimental results on 5 benchmark datasets demonstrate that the proposed method achieves performance comparable or superior to its counterparts.

The rest of the paper is organized as follows. Section 2 reviews several related methods that have attracted extensive attention. Section 3 describes the construction of the similarity consensus-based multi-view learning framework for manifold learning and illustrates the optimization algorithm in detail. Section 4 extends subspace learning and kernel learning methods into the multi-view framework based on similarity consensus. Section 5 presents empirical evaluations on text classification and image classification that demonstrate the effectiveness of the proposed approach. Section 6 concludes the paper.

Section snippets

Related works

In this section, we first introduce one representative single view-based manifold learning method, Locally Linear Embedding. Then, we review two widely studied multi-view learning methods: Canonical Correlation Analysis [40] and co-regularized multi-view spectral clustering [39].
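To make the co-regularization idea of [39] concrete (the notation $U^{(v)}$ for the spectral embedding of the $v$th view is ours), its pairwise disagreement term between two views can be written as:

$$D\big(U^{(v)},U^{(w)}\big)=\Big\|\widehat K_{U^{(v)}}-\widehat K_{U^{(w)}}\Big\|_F^{2} = 2-2\,\operatorname{tr}\big(\widehat K_{U^{(v)}}\widehat K_{U^{(w)}}\big),\qquad K_U=UU^{T},\ \ \widehat K_U=\frac{K_U}{\|K_U\|_F},$$

so minimizing the disagreement amounts to maximizing the correlation between the sample-similarity matrices induced by the two views. The similarity-consensus terms introduced in Section 3 play an analogous role for manifold learning.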

The similarity-consensus regularized multi-view manifold learning

In this section, we introduce our similarity-consensus regularized multi-view manifold learning framework in detail. First, we briefly recall the definition of single view-based manifold learning methods and provide background on multi-view manifold learning problems. Then, we utilize the agreement between the similarity matrices of two views as the consensus term and propose a multi-view framework based on such consensus terms among pairwise views, which not only

Extensions

In this section, we extend subspace learning and kernel learning methods into the multi-view framework based on similarity consensus. Most subspace learning methods can be cast as a special form of QCQP. Specifically, the linear projection for the $v$th view can be obtained as

$$\max_{W_v}\ \operatorname{tr}\big(W_v^{T} A_v W_v\big)\quad \text{s.t.}\quad W_v^{T} B_v W_v = I,$$

where $W_v \in \mathbb{R}^{D_v \times d_v}$ denotes the projection matrix for the $v$th view, $A_v$ is a symmetric matrix and $B_v$ is a symmetric positive definite matrix. Methods that fit this equation include PCA [16],
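As a sanity check on this formulation, the per-view trace maximization above is solved by a generalized eigenvalue problem. Below is a minimal single-view sketch in Python (using NumPy/SciPy; the helper name solve_trace_qcqp is ours for illustration, and the multi-view consensus terms of the proposed framework are not included):

```python
import numpy as np
from scipy.linalg import eigh

def solve_trace_qcqp(A, B, d):
    """Solve max_W tr(W^T A W) s.t. W^T B W = I via the generalized
    eigenproblem A w = lambda B w; the top-d eigenvectors form W."""
    # eigh returns eigenvalues in ascending order for the symmetric
    # definite pair (A, B), with eigenvectors satisfying v^T B v = 1.
    eigvals, eigvecs = eigh(A, B)
    idx = np.argsort(eigvals)[::-1][:d]   # indices of the d largest eigenvalues
    W = eigvecs[:, idx]                   # D x d projection matrix
    return W, eigvals[idx]

# Toy usage: a PCA-like instance where A is a covariance matrix and B = I.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 10))
    A = np.cov(X, rowvar=False)           # symmetric "scatter" matrix
    B = np.eye(10)                        # identity constraint -> ordinary PCA
    W, lams = solve_trace_qcqp(A, B, d=2)
    print(W.shape, lams)
```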

Experiments

In this section, we evaluate the performance of our framework by comparing it with several classical DR methods and multi-view learning methods on multi-view text and image datasets. First, we introduce the details of the datasets and the compared methods in Section 5.1. Then, the experiments on five multi-view datasets are presented in Sections 5.2 (textual datasets) and 5.3 (image datasets). Finally, we discuss the convergence of our method by summarizing the

Conclusion

In this paper, a novel multi-view learning framework based on similarity consensus is proposed with strong robustness and scalability, which makes full use of the complementary information among multi-view features. Moreover, the framework is flexible and scalable: it aims to straightforwardly extend existing single view-based learning methods that can be cast as the special form of QCQP, including subspace learning methods, kernel methods, and manifold learning methods, into the multi-view learning domain.

CRediT authorship contribution statement

Xiangzhu Meng: Conceptualization, Methodology, Software, Writing - review & editing. Huibing Wang: Data curation, Visualization. Lin Feng: Supervision, Investigation, Conceptualization.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

The authors would like to thank the anonymous reviewers for their insightful comments and suggestions to significantly improve the quality of this paper. This work was supported by National Natural Science Foundation of PR China (61672130, 61972064), the Postdoctoral Science Foundation, PR China (3620080307) and Liaoning Revitalization Talents Program, PR China (XLYC1806006).


References (54)

  • Y. Li, et al., A survey of multi-view representation learning, IEEE Trans. Knowl. Data Eng. (2018).
  • D. Lahat, et al., Multimodal data fusion: an overview of methods, challenges, and prospects, Proc. IEEE (2015).
  • S. Sun, A survey of multi-view machine learning, Neural Comput. Appl. (2013).
  • C. Xu, et al., A survey on multi-view learning (2013).
  • A.W. Smeulders, et al., Content-based image retrieval at the end of the early years, IEEE Trans. Pattern Anal. Mach. Intell. (2000).
  • R. Datta, et al., Image retrieval: Ideas, influences, and trends of the new age, ACM Comput. Surv. (2008).
  • J.-Y. Jiang, et al., A fuzzy self-constructing feature clustering algorithm for text classification, IEEE Trans. Knowl. Data Eng. (2011).
  • T. Ojala, et al., Multiresolution gray-scale and rotation invariant texture classification with local binary patterns, IEEE Trans. Pattern Anal. Mach. Intell. (2002).
  • M. Douze, et al., Evaluation of gist descriptors for web-scale image search.
  • N. Dalal, et al., Histograms of oriented gradients for human detection.
  • K. Fukunaga, Introduction to Statistical Pattern Recognition (2013).
  • S. Yu, et al., Robust linear discriminant analysis with a Laplacian assumption on projection distribution.
  • S. Dudoit, et al., Comparison of discrimination methods for the classification of tumors using gene expression data, J. Amer. Statist. Assoc. (2002).
  • X. He, et al., Locality preserving projections.
  • X. He, et al., Neighborhood preserving embedding.
  • D. Cai, X. He, K. Zhou, J. Han, H. Bao, Locality sensitive discriminant analysis, in: IJCAI, vol. 2007, 2007, pp....
  • K.Q. Weinberger, et al., Distance metric learning for large margin nearest neighbor classification.

    Xiangzhu Meng received his BS degree from Anhui University in 2015. He is now working towards the Ph.D. degree in the School of Computer Science and Technology, Dalian University of Technology, China. His research interests include multi-view learning, deep learning and computer vision.

    Huibing Wang received the Ph.D. degree from the School of Computer Science and Technology, Dalian University of Technology, Dalian, in 2018. From 2016 to 2017, he was a visiting scholar at the University of Adelaide, Adelaide, Australia. He is now a postdoctoral researcher at Dalian Maritime University, Dalian, Liaoning, China. He has authored and co-authored more than 20 papers in well-known journals and conferences, including TMM, TITS, TSMCS, ECCV, etc. Furthermore, he serves as a reviewer for TOIS, TNNLS, Neurocomputing, PR Letters and MTAP, etc. His research interests include computer vision and machine learning.

    Lin Feng received the BS degree in electronic technology from Dalian University of Technology, China, in 1992, the MS degree in power engineering from Dalian University of Technology, China, in 1995, and the Ph.D. degree in mechanical design and theory from Dalian University of Technology, China, in 2004. He is currently a professor and doctoral supervisor in the School of Innovation Experiment, Dalian University of Technology, China. His research interests include intelligent image processing, robotics, data mining, and embedded systems.
