Elsevier

Knowledge-Based Systems

Volume 212, 5 January 2021, 106615

Incomplete multi-view clustering with partially mapped instances and clusters

https://doi.org/10.1016/j.knosys.2020.106615

Abstract

Most multi-view clustering methods assume that each view has complete instances and clusters. However, in real-world applications, instances or clusters may be missing in some views. Recently, multi-view clustering on data with partially mapped instances has been studied. In this paper, we study multi-view clustering on data with partially mapped instances and clusters to extend the applicability of multi-view clustering. We propose an NMF (Non-negative Matrix Factorization) based algorithm that deals separately with the mapped clusters/instances and the individual clusters/instances, i.e., both the basis matrix and the indicator matrix consist of a mapped part and an individual part. By constraining the mapped instances to share the same indicator vectors, the mapped instances and clusters connect multiple views and guide the search for the indicator vectors of all the instances. Furthermore, we improve the algorithm by using local geometrical information to reduce the negative impact caused by multi-view interaction. Experiments show that the proposed algorithms perform well on data with partially mapped instances and clusters.

Introduction

Multi-view clustering has become a hot yet still challenging research topic, and many algorithms [1], [2], [3], [4] have been proposed over the past decade. Most existing algorithms rely on the idealized assumptions that the instances and clusters are completely mapped, i.e., the clusters and instances are the same in different views (Fig. 1(a)). However, in real-life applications, these assumptions may not be satisfied.

Recently, some studies have tried to relax these assumptions. Some algorithms [5], [6], [7], [8] only require that the instances are partially mapped (Fig. 1(b)), and the CMVNMF algorithm in [9] does not require instance mapping, but they all assume that the clusters are complete. The CGC algorithm [10] does not require cluster mapping, but its formulation requires that each instance in one view is mapped to instances in another view with a certain probability.

In real-life applications, multi-view data may have the relationships shown in Fig. 1(c). That is, the instances in different views are partially mapped; the clusters in different views are partially mapped; and the instances in a mapped cluster are not the same in different views. For example, a piece of news may be reported by many news agencies from different views, so news can be clustered from multiple views. However, a piece of news may not be reported by all the news agencies concerned, so instances are partially mapped; different news agencies may have different sections, so clusters are partially mapped; and different news agencies may report different news in the same section, so the instances in a mapped cluster are not the same in different views.

In this paper, we propose a Multi-View clustering algorithm for Partially mapped clusters and instances (MVP) in the framework of NMF (Nonnegative Matrix Factorization) based clustering. MVP divides both the basis matrix and the indicator matrix into a mapped part and an individual part. Since the instances representing the same object belong to the same cluster, the mapped instances belong to the mapped clusters. Also, the instances representing the same object can be represented by the same indicator vector, so the mapped instances connect multiple views and guide the search for the indicator vector of each instance. Furthermore, we propose the GMVP algorithm (Geometrically improved MVP), which improves MVP by using local geometrical information to reduce the negative impact caused by multi-view interaction. Specifically, the mapped indicator vectors are adjusted by those of their neighbors. Experimental results show that MVP achieves good performance on data with partially mapped instances and clusters, and GMVP further improves the clustering performance.
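The split into a mapped part and an individual part can be illustrated with a small NumPy sketch. It is not the authors' formulation: the exact objective, constraints, and the GMVP neighbor adjustment are not given in this excerpt. The sketch assumes a plain squared Frobenius loss, multiplicative updates in the style of NMF [11], and a data layout in which the first `n_mapped` columns of every view are the mapped instances in the same order. The function name `mvp_sketch` and all parameter names are hypothetical.

```python
import numpy as np

def mvp_sketch(X_views, n_mapped, k_mapped, k_indiv, n_iter=200, eps=1e-9, seed=0):
    """Illustrative NMF with a shared (mapped) and a per-view (individual) part.

    Assumptions (not taken from the paper):
      - X_views[v] is a nonnegative array of shape (d_v, n_v);
      - its first n_mapped columns are the mapped instances, ordered
        identically in every view; the remaining columns are individual;
      - view v has k_mapped mapped clusters plus k_indiv[v] individual ones.
    """
    rng = np.random.default_rng(seed)
    # Basis matrices: the first k_mapped columns are the mapped part.
    U = [rng.random((X.shape[0], k_mapped + k)) for X, k in zip(X_views, k_indiv)]
    # One indicator matrix shared by the mapped instances of all views.
    H_c = rng.random((k_mapped, n_mapped))
    # Per-view indicators for individual instances (over all clusters of the view).
    H_i = [rng.random((k_mapped + k, X.shape[1] - n_mapped))
           for X, k in zip(X_views, k_indiv)]

    for _ in range(n_iter):
        # Shared indicators: numerator and denominator accumulate over views.
        num = sum(Uv[:, :k_mapped].T @ X[:, :n_mapped] for Uv, X in zip(U, X_views))
        den = sum(Uv[:, :k_mapped].T @ Uv[:, :k_mapped] @ H_c for Uv in U)
        H_c *= num / (den + eps)

        for v, X in enumerate(X_views):
            # Individual indicators: standard multiplicative NMF update.
            H_i[v] *= (U[v].T @ X[:, n_mapped:]) / (U[v].T @ U[v] @ H_i[v] + eps)
            # Mapped instances get zero weight on individual clusters.
            pad = np.zeros((k_indiv[v], n_mapped))
            H_full = np.hstack([np.vstack([H_c, pad]), H_i[v]])
            # Basis update for the whole view.
            U[v] *= (X @ H_full.T) / (U[v] @ H_full @ H_full.T + eps)
    return U, H_c, H_i
```

Under these assumptions, cluster labels could be read off as `H_c.argmax(axis=0)` for the mapped instances and `H_i[v].argmax(axis=0)` for the individual instances of view `v`.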

To sum up, the major contributions are as follows:

  • We propose the MVP algorithm to cluster multi-view data with partially mapped instances and clusters.

  • We propose the GMVP algorithm to further improve MVP using local neighbors.

Related work

Recently, many multi-view clustering algorithms have been proposed, and NMF [11] based multi-view clustering algorithms have demonstrated their superiority.

Algorithms for completely mapped data. [3] formulated a joint matrix factorization process with a regularization that pushes the clustering solution of each view towards a common consensus. [12] and [13] enforced a shared coefficient matrix among different views, which ensures the consistency of multiple matrix factorizations. [14] clustered

Problem definition

To facilitate the discussion, we set the number of views to 2. Denote $\{x_i^v\}_{i=1}^{\hat{n}_v}$ as a two-view $d_v$-dimensional dataset, where $v \in \{1,2\}$ and $\hat{n}_v$ is the number of instances in the $v$th view. The dataset contains two parts $X_p^v$

Datasets

The proposed algorithms cluster multi-view data with incomplete instances and incomplete clusters. To run them, we must provide the set of mapped instances, the set of individual instances, the number of mapped clusters, and the number of individual clusters in each view.
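One way to organize these inputs for a single view is sketched below. The class `PartialViewInput` and its field names are hypothetical and are not taken from the paper; the sketch also assumes that instances are stored as columns and that `mapped_idx` lists the mapped instances in an order that is consistent across views.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class PartialViewInput:
    """Inputs for one view (hypothetical container, for illustration only)."""
    X: np.ndarray               # features, shape (d_v, n_v); columns are instances
    mapped_idx: np.ndarray      # column indices of mapped instances, in cross-view order
    n_mapped_clusters: int      # number of clusters shared with the other view(s)
    n_individual_clusters: int  # number of clusters that exist only in this view

    def individual_idx(self) -> np.ndarray:
        """Column indices of instances that have no counterpart in other views."""
        mask = np.ones(self.X.shape[1], dtype=bool)
        mask[self.mapped_idx] = False
        return np.flatnonzero(mask)

    def reordered(self) -> np.ndarray:
        """Return X with mapped instances first, followed by individual ones."""
        return np.hstack([self.X[:, self.mapped_idx], self.X[:, self.individual_idx()]])
```

Placing the mapped instances first in every view makes it straightforward to tie their indicator vectors together across views.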

Most existing multi-view datasets have complete instances and clusters. For the task of partial multi-view clustering, the common approach is to convert the existing complete datasets to

Conclusion

In this paper, we have studied the multi-view clustering problem on data with partially mapped instances and clusters. We have proposed an NMF based algorithm that deals separately with mapped clusters/instances and individual clusters/instances, in which the mapped instances and clusters connect multiple views. We have further improved the algorithm by using local geometrical information to reduce the negative impact caused by multi-view interaction. Experimental results show that the proposed

CRediT authorship contribution statement

Linlin Zong: Conceptualization, Methodology, Funding acquisition, Writing - review & editing, Writing - original draft, Formal analysis. Faqiang Miao: Methodology, Writing - original draft, Writing - review & editing. Xianchao Zhang: Supervision, Writing - review & editing, Resources. Xinyue Liu: Project administration, Writing - review & editing, Resources. Hong Yu: Writing - review & editing, Investigation, Resources.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work was supported by National Science Foundation of China (No. 61806034; No. 61876028; No. 61976037).

References (31)

  • L. Zong et al., Multi-view clustering via multi-manifold regularized non-negative matrix factorization, Neural Netw. Off. J. Int. Neural Netw. Soc. (2017)
  • Y.M. Xu et al., Weighted multi-view clustering with feature selection, Pattern Recognit. (2016)
  • W. Ou et al., Multi-view non-negative matrix factorization by patch alignment framework with view consistency, Neurocomputing (2016)
  • S. Bickel, T. Scheffer, Multi-view clustering, in: IEEE International Conference on Data Mining, 2004, pp....
  • A. Kumar, H. Daumé III, A co-training approach for multi-view spectral clustering, in: International Conference on Machine...
  • J. Liu, C. Wang, J. Gao, J. Han, Multi-view clustering via joint nonnegative matrix factorization, in: SIAM...
  • S.-Y. Li, Y. Jiang, Z.-H. Zhou, Partial multi-view clustering, in: Twenty-Eighth AAAI Conference on Artificial...
  • Q. Yin, S. Wu, L. Wang, Incomplete multi-view clustering via subspace learning, in: Proceedings of the 24th ACM...
  • W. Shao, L. He, P.S. Yu, Multiple incomplete views clustering via weighted nonnegative matrix factorization with l21...
  • B. Qian, X. Shen, Y. Gu, Z. Tang, Y. Ding, Double constrained NMF for partial multi-view clustering, in: International...
  • X. Zhang, L. Zong, X. Liu, H. Yu, Constrained nmf-based multi-view clustering on unmapped data, in: Twenty-Ninth AAAI...
  • W. Cheng, X. Zhang, Z. Guo, Y. Wu, P.F. Sullivan, W. Wang, Flexible and robust co-regularized multi-domain graph...
  • D.D. Lee et al., Algorithms for non-negative matrix factorization
  • Z. Akata, C. Thurau, C. Bauckhage, Non-negative matrix factorization in multimodality data for segmentation and label...
  • A.P. Singh, G.J. Gordon, Relational learning via collective matrix factorization, in: Proceedings of the 14th ACM...