A synchronous feature learning method for multiplex network embedding
Introduction
Network embedding, which aims to represent nodes using dense vectors, has been studied for decades [1]. Traditional unsupervised feature learning methods mainly exploit the spectral properties of networks [2], and both linear and nonlinear methods [4] have been proposed for dimension reduction [7]. However, such methods are statistically and computationally expensive, which makes them inefficient for large-scale networks. Recently, an online representation learning algorithm [10], inspired by word2vec [11], was proposed to learn node representations by modelling sequences of random walks. The node vectors obtained by this method successfully encode most structural features of the original networks. Subsequently, many approaches have been proposed to improve the embedding quality, such as structural deep network embedding (SDNE) [14], the hierarchical community structure preserving approach for network embedding (HCNE) [15] and multi-view adversarial learning-based network embedding (MVANE) [16]. A large number of experimental results have shown that network embedding is effective for analyzing networks in different tasks, including visualization, node classification, link prediction and community detection [27]. However, most existing algorithms are designed only for single-layer networks and do not consider the multiple relations between nodes in multiplex networks.
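The random-walk idea mentioned above can be illustrated with a minimal sketch. It assumes uniform, truncated walks on an unweighted single-layer graph; the function name `random_walks` and its parameters are illustrative and are not taken from [10]. The resulting node sequences would then be fed to a word2vec-style model, which is omitted here.

```python
import random

def random_walks(adj, walk_length, walks_per_node, seed=0):
    """Generate truncated uniform random walks starting from every node.

    adj: dict mapping each node to a list of its neighbours.
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adj)
        rng.shuffle(nodes)  # visit start nodes in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adj[walk[-1]]
                if not neighbours:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks

# Toy undirected graph: a triangle (0, 1, 2) with a pendant node 3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
walks = random_walks(adj, walk_length=5, walks_per_node=2)
```

Each walk plays the role of a sentence and each node the role of a word, which is why a skip-gram model can be reused without modification.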
Multilayer networks have emerged in recent years as an important new network science paradigm [18]. The term multilayer network is used here to refer to a variety of network models including interconnected networks [22], interdependent networks [23], multiplex networks [29], and networks of networks [25]. Although “multilayer” networks can be traced back to sociological and engineering problems of the late 1930s, developing a theory of multilayer networks and methods for quantifying their structural properties is still a matter of current research. In particular, the finding that seemingly irrelevant changes in one network can cause unexpected and catastrophic consequences in another network created a surge of interest in network science. Since then, many groundbreaking studies on multilayer networks have been published. For example, diffusion dynamics [26], disease spread and prevention [28], evolutionary games [5], and network representation learning [33] have all become topics of general interest.
In this paper, the focus is mainly on feature learning for multiplex networks. Multiplex networks are a special kind of multilayer network in which each layer contains the same set of nodes, but the nodes in each layer are connected by a different type of relation [29]. Traditionally, multiplex networks can be represented by adjacency tensors, vectors of adjacency matrices or super-adjacency matrices [24]. In real-life complex systems, two agents are usually connected by several types of relations [30]. For example, a group of business people may be competitors and friends at the same time, and different airlines operate their own routes between airports. Compared with single-layer networks, multiplex networks are more suitable for describing such systems since all the relations can be stored in one multiplex network [31]. Recent studies have also shown that multiplex networks have properties substantially distinct from those of single-layer networks. For instance, the robustness of multiplex networks is subject to cascading effects: the breakdown of one relationship may lead to the breakdown of other relationships. Taking the relationship between business people as an example, if their friendships break down, then their businesses may also fail. In addition, in [32], Domenico et al. discovered that the structures of different layers in some multiplex networks are similar, and that these similarities between layers offer useful guidance for the study of multiplex networks. However, the existing network embedding algorithms cannot be directly applied to multiplex networks because of the substantially distinct properties and structures of the individual layers. Thus, it is necessary to design a multiplex network embedding algorithm that can preserve the special structure of multiplex networks.
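The representations mentioned above can be made concrete with a small sketch. It assumes undirected, unweighted layers and builds (i) a stack of per-layer adjacency matrices, which serves as an adjacency tensor, and (ii) a block matrix in which every node is linked to its own replica in the other layers. The function names and the `coupling` weight are illustrative assumptions, not notation from [24].

```python
def adjacency_tensor(num_nodes, layers):
    """Stack one adjacency matrix per layer: tensor[l][i][j] == 1 iff i-j is an edge of layer l."""
    tensor = []
    for edges in layers:
        matrix = [[0] * num_nodes for _ in range(num_nodes)]
        for i, j in edges:
            matrix[i][j] = matrix[j][i] = 1  # undirected edge
        tensor.append(matrix)
    return tensor

def supra_adjacency(tensor, coupling=1):
    """Flatten the tensor into one (L*N) x (L*N) block matrix.

    Diagonal blocks hold the intra-layer adjacency matrices; off-diagonal
    blocks link each node to its own replica in every other layer.
    """
    num_layers, n = len(tensor), len(tensor[0])
    size = num_layers * n
    supra = [[0] * size for _ in range(size)]
    for a in range(num_layers):
        for b in range(num_layers):
            for i in range(n):
                if a == b:
                    for j in range(n):
                        supra[a * n + i][b * n + j] = tensor[a][i][j]
                else:
                    supra[a * n + i][b * n + i] = coupling  # assumed inter-layer weight
    return supra

# Three business people (0, 1, 2); layer 0 = friendship, layer 1 = competition.
layers = [[(0, 1), (1, 2)], [(0, 2)]]
tensor = adjacency_tensor(3, layers)
supra = supra_adjacency(tensor)
```

The tensor keeps the layers separate, whereas the block matrix exposes the whole multiplex network to methods that expect a single matrix, at the cost of an L-fold larger node index.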
Compared with an embedding of a single-layer network, an embedding of a multiplex network should exhibit two additional basic properties. First, the embedding should exploit the information provided by every layer of the multiplex network to improve its quality, not only the connections between nodes within the same layer. Second, the different layers of a multiplex network should be embedded into a unified space. That is, nodes in different layers with identical structures should be represented by similar vectors. In this way, the structural similarity between layers in multiplex networks can be preserved. Recently, many studies have extended the concepts and methods for single-layer networks to learn multiplex network representations. Their main strategies fall into four categories. The first aggregates the multiplex network into a single-layer network and then embeds it using traditional embedding methods [33]. The second applies random walks to the whole multiplex network, including both inter- and intra-layer connections, to generate the node sequences for embedding [34]. The shared drawback of these two strategies is that when valuable connections are extended to other layers, incidental and insignificant connections are carried along with them, which may destroy the original structure of the multiplex network. The third learns a common vector or a layer vector as the base vector and combines it with the separately learned node vectors to obtain the final node representations. The difficulty with this kind of method is how to define the similarity between layers in multiplex networks. The last category extends graph neural networks to learn multiplex network representations. However, these methods require node labels in advance for training, which is sometimes unrealistic in real-life scenarios [38].
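The first strategy, aggregation, can be sketched in a few lines. The example assumes undirected layers and uses the number of layers containing an edge as its aggregated weight; this weighting scheme and the name `aggregate_layers` are illustrative choices, not the specific scheme of [33].

```python
from collections import Counter

def aggregate_layers(layers):
    """Collapse a multiplex network into one weighted single-layer edge map.

    The weight of an edge is the number of layers in which it appears.
    """
    weights = Counter()
    for edges in layers:
        # Normalise (u, v) and (v, u) to one key and count each edge once per layer.
        for edge in {tuple(sorted(e)) for e in edges}:
            weights[edge] += 1
    return dict(weights)

layers = [
    [(0, 1), (1, 2)],  # layer 0
    [(1, 0), (2, 3)],  # layer 1
]
aggregated = aggregate_layers(layers)
```

In the aggregated graph, node 3 becomes reachable from node 1 through node 2, although no single layer contains both of those edges. This is the kind of incidental connection that the drawback described above refers to.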
In this paper, a synchronous feature learning method for multiplex networks, termed Multi2vec, is proposed to address the challenges of multiplex network embedding mentioned above. The main contributions of this paper are summarized as follows.
- The multi-order proximity and the problem of multiplex network embedding are defined to address the limitation of traditional embedding methods for multiplex networks.
- A method for reconstructing the multiplex networks is proposed to preserve both local and global structures in the original multiplex networks, which is helpful for improving the embedding quality.
- Extensive experiments on six different real-life multiplex networks are conducted and the experimental results demonstrate the superiority of Multi2vec on three tasks: visualization, link prediction and node classification.
The remainder of this paper is organized as follows. In Section 2, related work on feature learning for both single-layer and multiplex networks is introduced briefly. The problem of synchronous multiplex network embedding is formally defined in Section 3. In Section 4, a detailed description of the proposed algorithm Multi2vec is provided. Section 5 presents the experimental results obtained by Multi2vec and related analyses on three different tasks, together with comparisons with some state-of-the-art algorithms. Finally, the conclusion and future work are given in Section 6.
Section snippets
Related work
In this section, the state-of-the-art network embedding algorithms for single-layer networks are reviewed first. Then, several embedding algorithms developed for multiplex networks are also introduced briefly.
Problem definition
In this section, the definition of multiplex networks is first given for a better understanding. Then three definitions of proximity at different levels for multiplex networks are described. Finally, the synchronous embedding strategy for multiplex networks is defined based on these proximities.
Definition 1 (Multiplex networks). A multiplex network with L layers is defined as G = {V, E, L}, where V and E are the sets of vertices and edges, respectively. G_l represents the l-th layer of the multiplex network G. with
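Definition 1 can be mirrored by a small data structure. The sketch below assumes, following the introduction, that every layer shares the same vertex set V and that edges are undirected; the class name `MultiplexNetwork` and its methods are hypothetical and only illustrate the definition.

```python
from dataclasses import dataclass, field

@dataclass
class MultiplexNetwork:
    """G = {V, E, L}: a shared vertex set V and one edge set E_l per layer."""
    vertices: frozenset
    layer_edges: list = field(default_factory=list)  # layer_edges[l] is E_l

    def __post_init__(self):
        # Every edge must stay inside the shared vertex set V.
        for index, edges in enumerate(self.layer_edges):
            for u, v in edges:
                if u not in self.vertices or v not in self.vertices:
                    raise ValueError(f"edge ({u}, {v}) in layer {index} leaves V")

    @property
    def num_layers(self):
        return len(self.layer_edges)

    def layer(self, index):
        """Return G_l as (V, E_l); every layer keeps the full vertex set."""
        return self.vertices, set(self.layer_edges[index])

    def neighbours(self, index, node):
        """Neighbours of `node` inside one layer (edges are undirected)."""
        result = set()
        for u, v in self.layer_edges[index]:
            if u == node:
                result.add(v)
            elif v == node:
                result.add(u)
        return result

G = MultiplexNetwork(frozenset({0, 1, 2}), [[(0, 1), (1, 2)], [(0, 2)]])
```

Keeping V outside the per-layer edge lists makes the constraint that all layers share the same nodes explicit rather than implicit.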
Synchronous feature learning for multiplex networks
In this section, the details of Multi2vec are described. First, a method to reconstruct multiplex networks is proposed. Then, the objective functions for preserving both the first- and second-order proximities are introduced. Finally, the framework of Multi2vec is given.
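For orientation, first- and second-order proximity objectives are commonly written as in LINE, which appears in the reference list. The block below restates that single-layer formulation. The symbols u_i (vertex vector), u'_j (context vector) and w_ij (edge weight) follow LINE. How Multi2vec adapts these objectives to the reconstructed multiplex network is not shown in this snippet, so the block should be read as background rather than as the proposed method.

```latex
% First-order proximity: joint probability of an observed edge (v_i, v_j)
p_1(v_i, v_j) = \frac{1}{1 + \exp\left(-\mathbf{u}_i^{\top}\mathbf{u}_j\right)}, \qquad
O_1 = -\sum_{(i,j) \in E} w_{ij} \log p_1(v_i, v_j)

% Second-order proximity: probability of context v_j given vertex v_i
p_2(v_j \mid v_i) = \frac{\exp\left({\mathbf{u}'_j}^{\top}\mathbf{u}_i\right)}
                         {\sum_{k=1}^{|V|} \exp\left({\mathbf{u}'_k}^{\top}\mathbf{u}_i\right)}, \qquad
O_2 = -\sum_{(i,j) \in E} w_{ij} \log p_2(v_j \mid v_i)
```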
Experiments and analysis
In this section, six real-life multiplex networks are used to validate the performance of Multi2vec. Extensive experiments are conducted on three real applications: visualization, link prediction and node classification. The proposed Multi2vec is compared with some state-of-the-art embedding algorithms intended for single-layer and multiplex networks. Here, the algorithms for single-layer networks treat each layer of a multiplex network as an independent network.
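A link prediction evaluation of the kind described above can be sketched as follows. The snippet does not state the scoring function or the metric used in the paper, so both are assumptions here: candidate links are scored by the sigmoid of the embedding dot product, and quality is measured by AUC over held-out edges and sampled non-edges. The embeddings are hypothetical toy values.

```python
import math

def sigmoid_score(u, v):
    """Score a candidate link by the sigmoid of the embedding dot product."""
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 / (1.0 + math.exp(-dot))

def auc(pos_scores, neg_scores):
    """Probability that a true edge outscores a non-edge (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical 2-d embeddings for four nodes.
emb = {0: [1.0, 0.5], 1: [0.9, 0.6], 2: [-1.0, 0.2], 3: [-0.8, 0.1]}
held_out_edges = [(0, 1), (2, 3)]  # true links hidden during training
non_edges = [(0, 2), (1, 3)]       # sampled node pairs with no link

pos = [sigmoid_score(emb[u], emb[v]) for u, v in held_out_edges]
neg = [sigmoid_score(emb[u], emb[v]) for u, v in non_edges]
result = auc(pos, neg)
```

For a multiplex network, the same procedure would be repeated per layer, using the vectors that the embedding method assigns to the nodes of that layer.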
Conclusion and future work
In this paper, a novel synchronous feature learning method for multiplex networks termed Multi2vec is proposed. By reconstructing the multiplex networks, all the first-, second- and multi-order proximities in the multiplex networks can be well preserved during the embedding process. The synchronous learning strategy makes it possible to embed entire multiplex networks into a unified space effectively. The experimental results on three tasks, visualization, link prediction and node
Funding
This work was supported in part by the Key Project of Science and Technology Innovation 2030 supported by the Ministry of Science and Technology of China under Grant 2018AAA0101302 and in part by the General Program of National Natural Science Foundation of China (NSFC) under Grant 61773300.
CRediT authorship contribution statement
Xiangyi Teng: Formal analysis, Methodology, Software, Writing - Original draft preparation, Writing - review & editing. Jing Liu: Conceptualization, Funding acquisition, Resources, Supervision. Liqiang Li: Data curation, Investigation, Validation, Visualization. Hu Zhang: Project administration, Supervision.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (50)
- et al. Principal component analysis. Chem. Intelligent Lab. Syst. (1987)
- et al. Evolutionary games on multilayer networks: a colloquium. Eur. Phys. J. B (2015)
- et al. Adversarial learning for multi-view network embedding on incomplete graphs. Knowl.-Based Syst. (2019)
- et al. Structured subspace embedding on attributed networks. Inf. Sci. (2020)
- et al. TPNE: topology preserving network embedding. Inf. Sci. (2019)
- et al. The structure and dynamics of multilayer networks. Phys. Rep. (2014)
- et al. The aggregation of multiplex networks based on the similarity of networks. Physica A (2020)
- et al. Friends and neighbors on the web. Social Networks (2003)
- et al. Spectral graph theory. American Mathematical Soc. (1997)
- et al. Multidimensional Scaling (2000)
- A global geometric framework for nonlinear dimensionality reduction. Science
- Nonlinear dimensionality reduction by locally linear embedding. Science
- Graph embedding and extensions: a general framework for dimensionality reduction. IEEE Trans. Pattern Anal. Mach. Intell.
- Individual comparisons by ranking methods. Biometrics
- Laplacian eigenmaps and spectral techniques for embedding and clustering. Adv. Neural Inf. Process. Syst.
- Deepwalk: Online learning of social representations
- Efficient estimation of word representations in vector space. Comput. Sci.
- Distributed representations of words and phrases and their compositionality. Adv. Neural Inf. Process. Syst.
- node2vec: Scalable feature learning for networks
- Structural deep network embedding
- Hierarchical community structure preserving approach for network embedding. Inf. Sci.
- Line: Large-scale information network embedding
- Focus on multilayer networks. New J. Phys.