Elsevier

Information Sciences

Volume 574, October 2021, Pages 176-191

A synchronous feature learning method for multiplex network embedding

https://doi.org/10.1016/j.ins.2021.05.083

Abstract

Compared with single-layer networks, multiplex networks can describe real-world scenarios in more detail, but they require considerable computing and storage resources. Network feature learning, which aims to embed networks into a low-dimensional space, is an effective way to reduce this cost. Currently, research on multiplex network embedding faces two major challenges: how to make full use of the connection information in different layers, and how to embed multiplex networks into a unified space. In this paper, a novel multiplex network embedding model is proposed to address these two challenges. It preserves the first-, second- and multi-order proximities in multiplex networks by optimizing the corresponding objective functions. The network reconstruction step combines information from the different types of relations in other layers while maintaining their distinctive properties. The proposed synchronous learning strategy provides a path to embed multiplex networks into a unified space. Extensive experiments on three real applications (visualization, link prediction and node classification) are conducted to validate the effectiveness of the proposed method. The experimental results show that its performance is better than or comparable to that of several state-of-the-art methods.

Introduction

Network embedding, which aims to represent nodes using dense vectors, has been studied for decades [1]. Traditional unsupervised feature learning methods mainly exploit the spectral properties of networks [2], and both linear and nonlinear methods [4] have been proposed for dimension reduction [7]. However, such methods demand substantial statistical and computational resources, making them inefficient for large-scale networks. Recently, an online representation learning algorithm [10], inspired by word2vec [11], was proposed to learn node representations by modelling sequences of random walks. The node vectors obtained by this method can successfully encode most structural features of the original networks. Subsequently, many approaches have been proposed to improve the embedding quality, such as structural deep network embedding (SDNE) [14], the hierarchical community structure preserving approach for network embedding (HCNE) [15] and multi-view adversarial learning-based network embedding (MVANE) [16]. Extensive experimental results have shown that network embedding is effective for analyzing networks in different tasks, including visualization, node classification, link prediction and community detection [27]. However, most existing algorithms are designed only for single-layer networks and do not consider the multiple relations between nodes in multiplex networks.
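As a rough illustration of the random-walk idea mentioned above, the following minimal sketch generates truncated uniform random walks over a small single-layer graph. It is not the implementation from [10]: the function name `random_walks`, its parameters and the toy graph are hypothetical, and the subsequent word2vec-style training on the walks is omitted.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate truncated uniform random walks starting from every node.

    adjacency maps each node to a list of its neighbors (hypothetical format).
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adjacency)
        rng.shuffle(nodes)  # visit start nodes in a random order on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Toy undirected graph, given as symmetric neighbor lists.
graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
walks = random_walks(graph, walk_length=5, walks_per_node=2)
```

Each walk plays the role of a sentence and each node the role of a word, which is what allows a word2vec-style model to be trained on the resulting sequences.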

Multilayer networks have emerged in recent years as an important new network science paradigm [18]. The term multilayer network is used here to refer to a variety of network models including interconnected networks [22], interdependent networks [23], multiplex networks [29], and networks of networks [25]. Although “multilayer” networks can be traced back to sociological and engineering problems of the late 1930s, the effort to develop a theory of multilayer networks and methods for quantifying their structural properties is a matter of current research. In particular, the finding that seemingly irrelevant changes in one network can cause unexpected and catastrophic consequences in another network created a surge of interest in network science. Many groundbreaking studies related to multilayer networks were subsequently published. For example, diffusion dynamics [26], disease spread and prevention [28], evolutionary games [5], and network representation learning [33] have all become hot topics of general interest.

In this paper, the focus is mainly on feature learning for multiplex networks. Multiplex networks are a special kind of multilayer network in which every layer contains the same set of nodes, but the nodes in each layer are connected by a different type of relation [29]. Traditionally, multiplex networks are represented by adjacency tensors, vectors of adjacency matrices or super-adjacency matrices [24]. In real-life complex systems, two agents are usually linked by several types of relations [30]. For example, a group of business people may be competitors and friends at the same time, and different airlines operate their own routes between airports. Multiplex networks are better suited than single-layer networks to describing such systems, since they can store all of these relations [31]. Recent studies have also shown that multiplex networks have properties substantially distinct from those of single-layer networks. For instance, the robustness of multiplex networks is affected by cascading: the breakdown of one relationship may lead to the breakdown of other relationships. Taking the relationship between business people as an example, if their friendships break down, then their businesses may also fail. In addition, Domenico et al. [32] discovered that the structures of different layers in some multiplex networks are similar, and that these similarities between layers offer useful guidance for the study of multiplex networks. However, existing network embedding algorithms cannot be directly applied to multiplex networks, because multiplex networks differ substantially from single-layer networks in their properties and structure. Thus, it is necessary to design a multiplex network embedding algorithm that can preserve the special structure of multiplex networks.
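To make the representations mentioned above concrete, the sketch below builds a vector of per-layer adjacency matrices (equivalently, an adjacency tensor indexed as layer, node, node) and a super-adjacency matrix (often called a supra-adjacency matrix in the literature) for a toy two-layer network of three business people. The helper names are hypothetical, and coupling every node to its own replica in every other layer with a fixed weight is one common convention assumed here, not necessarily the one used in [24].

```python
def layer_adjacency(num_nodes, edges):
    """Symmetric 0/1 adjacency matrix of one undirected layer."""
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        matrix[i][j] = 1
        matrix[j][i] = 1
    return matrix


def super_adjacency(layers, coupling=1):
    """Block matrix with layer adjacencies on the diagonal blocks.

    Off-diagonal blocks hold coupling * identity, linking each node to its
    replica in every other layer (an assumed convention for this sketch).
    """
    n, num_layers = len(layers[0]), len(layers)
    size = n * num_layers
    matrix = [[0] * size for _ in range(size)]
    for a in range(num_layers):
        for i in range(n):
            for j in range(n):
                matrix[a * n + i][a * n + j] = layers[a][i][j]
            for b in range(num_layers):
                if b != a:
                    matrix[a * n + i][b * n + i] = coupling
    return matrix


# Same three people (nodes 0, 1, 2) in both layers, different relations.
friendship = layer_adjacency(3, [(0, 1), (1, 2)])
competition = layer_adjacency(3, [(0, 2)])
layers = [friendship, competition]  # adjacency tensor: layers[l][i][j]
S = super_adjacency(layers)  # 6 x 6 matrix over (layer, node) pairs
```

Row `a * n + i` of `S` corresponds to node `i` in layer `a`, so the intra-layer structure stays in the diagonal blocks while the off-diagonal blocks record only the identity of nodes across layers.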

Compared with single-layer network embedding, an embedding for multiplex networks should exhibit two additional basic properties. First, the embedding should exploit the information provided by every layer of the multiplex network to improve its quality, not only the connections between nodes within the same layer. Second, the different layers of a multiplex network should be embedded into a unified space; that is, nodes in different layers with identical structures should be represented by similar vectors. In this way, the structural similarity between layers can be preserved. Recently, many studies have extended concepts and methods from single-layer networks to learn multiplex network representations. Their main strategies fall into four categories. The first aggregates the multiplex network into a single-layer network and then embeds it using traditional embedding methods [33]. The second applies random walks to the whole multiplex network, covering both inter- and intra-layer connections, to generate the node sequences for embedding [34]. The shared drawback of these two strategies is that when valuable connections are extended to other layers, incidental and insignificant connections are carried over with them, which may destroy the original structure of the multiplex network. The third learns a common or layer vector as a base vector and combines it with separately learned node vectors to obtain the final node representations. The difficulty with this kind of method is how to define the similarity between layers. The last category extends graph neural networks to learn multiplex network representations. However, these methods require node labels in advance during training, which is sometimes unrealistic in real-life scenarios [38].
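The first strategy and its drawback can be illustrated with a minimal sketch. The function `aggregate_layers`, the toy layers and the choice of weighting each edge by the number of layers that contain it are all assumptions made for illustration; the aggregation actually used in [33] may differ.

```python
from collections import Counter


def aggregate_layers(layers):
    """Collapse per-layer edge lists into one weighted single-layer edge map.

    The weight of an edge is the number of layers that contain it
    (an assumed aggregation rule for this sketch).
    """
    weights = Counter()
    for edges in layers.values():
        for u, v in edges:
            weights[tuple(sorted((u, v)))] += 1  # undirected: canonical order
    return weights


layers = {
    "friendship": [("ann", "bob"), ("bob", "cat")],
    "competition": [("bob", "ann"), ("ann", "cat")],
}
aggregated = aggregate_layers(layers)
```

After aggregation, the friendship between bob and cat and the competition between ann and cat become indistinguishable edges of weight 1, which is the kind of structural information loss the paragraph above describes.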

In this paper, a synchronous feature learning method for multiplex networks, termed Multi2vec, is proposed to address the challenges of multiplex network embedding described above. The main contributions of this paper are summarized as follows.

  • The multi-order proximity and the problem of multiplex network embedding are defined to address the limitation of traditional embedding methods for multiplex networks.

  • A method for reconstructing the multiplex networks is proposed to preserve both local and global structures in the original multiplex networks, which is helpful for improving the embedding quality.

  • Extensive experiments on six different real-life multiplex networks are conducted and the experimental results demonstrate the superiority of Multi2vec on three tasks: visualization, link prediction and node classification.

The remainder of this paper is organized as follows. In Section 2, related work on feature learning for both single-layer and multiplex networks is introduced briefly. The problem of synchronous multiplex network embedding is formally defined in Section 3. In Section 4, a detailed description of the proposed algorithm, Multi2vec, is provided. Section 5 presents the experimental results obtained by Multi2vec and related analyses on three different tasks, together with comparisons with some state-of-the-art algorithms. Finally, conclusions and future work are given in Section 6.

Related work

In this section, the state-of-the-art network embedding algorithms for single-layer networks are reviewed first. Then, several embedding algorithms developed for multiplex networks are also introduced briefly.

Problem definition

In this section, the definition of multiplex networks is given first. Then, three definitions of proximity at different levels for multiplex networks are described. Finally, the synchronous embedding strategy for multiplex networks is defined based on these proximities.

Definition 1 Multiplex networks

A multiplex network with L layers is defined as G = {V, E, L} where V and E are the sets of vertices and edges, respectively. Gl represents the lth layer of the multiplex network G. vil with

Synchronous feature learning for multiplex networks

In this section, the details of Multi2vec are described. First, a method to reconstruct multiplex networks is proposed. Then, the objective functions for preserving both the first- and second-order proximities are introduced. Finally, the framework of Multi2vec is given.
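The objective functions themselves are not reproduced in this excerpt. For orientation only, the block below gives the first- and second-order proximity objectives of the single-layer LINE model (Tang et al., listed in the references). The symbols $u_i$ (node vector), $u'_j$ (context vector) and $w_{ij}$ (edge weight) follow LINE's notation; treating the Multi2vec objectives as being of a similar form is an assumption, not something stated in the text above.

```latex
% First-order proximity (LINE): joint probability of an observed edge (i, j)
p_1(v_i, v_j) = \frac{1}{1 + \exp\left(-u_i^{\top} u_j\right)},
\qquad
O_1 = -\sum_{(i,j) \in E} w_{ij} \log p_1(v_i, v_j)

% Second-order proximity (LINE): probability of context v_j given node v_i
p_2(v_j \mid v_i) = \frac{\exp\left({u'_j}^{\top} u_i\right)}
                         {\sum_{k=1}^{|V|} \exp\left({u'_k}^{\top} u_i\right)},
\qquad
O_2 = -\sum_{(i,j) \in E} w_{ij} \log p_2(v_j \mid v_i)
```

Minimizing $O_1$ pulls directly connected nodes together, while minimizing $O_2$ gives similar vectors to nodes that share similar neighborhoods.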

Experiments and analysis

In this section, six real-life multiplex networks are used to validate the performance of Multi2vec. Extensive experiments are conducted on three real applications: visualization, link prediction and node classification. The proposed Multi2vec is compared with some state-of-the-art embedding algorithms intended for single-layer and multiplex networks. Here, the algorithms for single-layer networks treat each layer of a multiplex network as an independent network.

Conclusion and future work

In this paper, a novel synchronous feature learning method for multiplex networks termed Multi2vec is proposed. By reconstructing the multiplex networks, all the first-, second- and multi-order proximities in the multiplex networks can be well preserved during the embedding process. The synchronous learning strategy makes it possible to embed entire multiplex networks into a unified space effectively. The experimental results on three tasks, visualization, link prediction and node

Funding

This work was supported in part by the Key Project of Science and Technology Innovation 2030 supported by the Ministry of Science and Technology of China under Grant 2018AAA0101302 and in part by the General Program of National Natural Science Foundation of China (NSFC) under Grant 61773300.

CRediT authorship contribution statement

Xiangyi Teng: Formal analysis, Methodology, Software, Writing - Original draft preparation, Writing - review & editing. Jing Liu: Conceptualization, Funding acquisition, Resources, Supervision. Liqiang Li: Data curation, Investigation, Validation, Visualization. Hu Zhang: Project administration, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (50)

  • J.B. Tenenbaum et al., A global geometric framework for nonlinear dimensionality reduction, Science (2000)
  • S.T. Roweis et al., Nonlinear dimensionality reduction by locally linear embedding, Science (2000)
  • S. Yan et al., Graph embedding and extensions: a general framework for dimensionality reduction, IEEE Trans. Pattern Anal. Mach. Intell. (2007)
  • F. Wilcoxon, Individual comparisons by ranking methods, Biometrics (1945)
  • M. Belkin et al., Laplacian eigenmaps and spectral techniques for embedding and clustering, Adv. Neural Inf. Process. Syst. (2002)
  • B. Perozzi et al., DeepWalk: Online learning of social representations
  • T. Mikolov et al., Efficient estimation of word representations in vector space, Comput. Sci. (2013)
  • T. Mikolov et al., Distributed representations of words and phrases and their compositionality, Adv. Neural Inf. Process. Syst. (2013)
  • A. Grover et al., node2vec: Scalable feature learning for networks
  • D. Wang et al., Structural deep network embedding
  • Z. Duan et al., Hierarchical community structure preserving approach for network embedding, Inf. Sci. (2020)
  • J. Tang et al., LINE: Large-scale information network embedding
  • Y. Moreno et al., Focus on multilayer networks, New J. Phys. (2020)
  • T.N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, arXiv:1609.02907...
  • P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, Y. Bengio, Graph attention networks, arXiv:1710.10903...