HCNA: Hyperbolic Contrastive Learning Framework for Self-Supervised Network Alignment

https://doi.org/10.1016/j.ipm.2022.103021

Highlights

  • A novel self-supervised contrastive learning-based technique to align entities across two networks.

  • We introduce network specific guided augmentations to generate multiple graph views.

  • We employ multi-order hyperbolic graph convolution networks to capture higher order hierarchical structures.

  • Our extensive experiments show that HCNA consistently outperforms the baselines by 1%–84% in terms of accuracy score.

Abstract

Network alignment, or identifying the same entities (anchors) across multiple networks, has significant applications across diverse fields. Unsupervised approaches for network alignment, though popular, strictly assume that the anchor nodes’ structure and attributes remain consistent across different networks. In practice, however, strictly adhering to these constraints makes it difficult to deal with networks that exhibit high variance in structural characteristics and inherent structural noise, such as missing nodes and edges, resulting in poor generalization. To handle these shortcomings, we propose HCNA: Hyperbolic Contrastive Learning Framework for Self-Supervised Network Alignment, a novel self-supervised contrastive learning model that learns from multiple augmented views of each network, thereby making HCNA robust to the inherent multi-network characteristics. Furthermore, we propose multi-order hyperbolic graph convolution networks to generate node embeddings for each network that can handle the hierarchical structure of networks. The main objective of HCNA is to obtain structure-preserving embeddings that are also robust to noise and variations for better alignment results. The major novelty lies in generating multiple augmented graph views for contrastive learning that are driven by real-world network dynamics. Rigorous investigations on four real datasets show that HCNA consistently outperforms the baselines by 1%–84% in terms of accuracy score. Furthermore, HCNA is also more resilient to structural and attribute noise, as evidenced by its adaptivity analysis under adversarial conditions.

Introduction

Graph-based machine learning techniques have shown considerable potential in solving a wide range of problems in a variety of fields, including biological science (protein interactions (Theocharidis, Van Dongen, Enright, & Freeman, 2009), drug side effects (Shabani-Mashcool, Marashi, & Gharaghani, 2020)), social network analysis (friendship networks) (Eagle et al., 2006), and linguistics (word co-occurrence networks) (Cancho & Solé, 2001). Hence, the study of real-world networks to mine and gather insights has gained considerable interest (Zarrinkalam, Kahani, & Bagheri, 2018). While analyzing a single network is essential for various applications like detecting communities (Fani et al., 2020), predicting links (Wu, Zhang, & Ren, 2017), and user modeling (Zhang, 2008), some problems, such as graph clustering (Wang et al., 2021) and alignment (Goga, Loiseau, Sommer, Teixeira, & Gummadi, 2015), cannot be solved without addressing the interactions between graphs. Network alignment refers to the mapping of the same entities (termed anchor nodes) across different networks (Goga et al., 2015, Zhang and Philip, 2015). It finds applications in diverse areas such as biological networks, social networks, and computer vision (Bayati, Gleich, Saberi, & Wang, 2013). Finding node correspondences across networks also alleviates the sparsity issue of analyzing a single network, such as a social network, and facilitates downstream tasks like friend and product recommendation (Ahmadian et al., 2018, Sun et al., 2015, Xin and Wu, 2020).

Most of the recent network alignment approaches mainly rely on deep learning-based node embedding techniques to perform alignment with the help of labeled anchor node information (Kaushal et al., 2020, Liu et al., 2016, Liu et al., 2019, Man et al., 2016, Zhou et al., 2018) that supervises the alignment process. However, labeling these anchor nodes requires domain knowledge and extensive manual effort. Furthermore, utilizing only a small number of labeled anchor nodes may not be enough to capture the structural variations of all the nodes in the network, resulting in poor generalization. Hence, several recent semi-supervised and unsupervised approaches learn the anchor node mappings purely from the network structure and node attributes (Du et al., 2019, Hong et al., 2020, Trung, Van Vinh, Tam, Yin, Weidlich, and Hung, 2020). However, a major issue with these approaches is the underlying assumption of strong topological and attribute consistency constraints (Heimann, Shen, Safavi, & Koutra, 2018). The topological consistency constraint presumes that two nodes close in one network are likewise close in the other, whereas the attribute consistency constraint demands that related nodes in different networks have the same attribute values. In practice, however, such assumptions are frequently violated. Users’ social connections on Facebook and LinkedIn, for example, may be completely different; moreover, because LinkedIn is a professional network, the user information made public there may differ considerably from that on Facebook.

Another central assumption is that the sampled networks for alignment fully capture the underlying topologies and node attributes (Koutra et al., 2013, Zhang and Philip, 2015). As a result, these approaches align nodes that lie close in the embedding space under strict consistency constraints. However, this is a bold assumption. It is challenging to collect complete and precise knowledge about the topology and node attributes of many large real-world networks, such as Twitter and Facebook; the sampled networks may contain both missing nodes and missing links.

To alleviate this intrinsic structural and attribute noise in the networks, we draw inspiration from the recent success of self-supervised contrastive learning approaches for creating graph representations. The main idea is to capture network variations (or noise) by learning from multiple random views of the network (Hassani & Khasahmadi, 2020). However, the structural variations of different networks are not purely random; hence, the direct application of these techniques for creating simultaneous node representations for anchor node mapping may be ineffective. Instead, these variations demand appropriate guidance of the contrastive learning mechanism to capture the broad range of structural possibilities relevant to network alignment in real-world scenarios.
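
As a rough illustration of the view-contrasting idea only (not HCNA’s specific augmentations, encoder, or hyperbolic loss), the NumPy sketch below computes an InfoNCE-style objective that pulls together the embeddings of the same node under two augmented views and pushes apart the embeddings of different nodes; the function name and the randomly generated embeddings are hypothetical placeholders.

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    """InfoNCE-style contrastive loss between two graph views.

    z1, z2: (n_nodes, dim) embeddings of the same nodes under two augmented
    views. Row i of z1 and row i of z2 form a positive pair; all other rows
    act as negatives. This is a generic formulation, not HCNA's objective.
    """
    # L2-normalize so dot products become cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / temperature                      # (n, n) similarities
    # Cross-entropy with the diagonal (same node in both views) as the target.
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

# Hypothetical usage: two correlated "views" of a 100-node graph, each encoded
# into 32-dimensional embeddings by some placeholder encoder.
rng = np.random.default_rng(0)
view1_emb = rng.normal(size=(100, 32))
view2_emb = view1_emb + 0.1 * rng.normal(size=(100, 32))
print(info_nce_loss(view1_emb, view2_emb))
```

The random perturbation used above is exactly what the paragraph argues against; in HCNA, network-guided augmentations take its place.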

In this paper, we introduce HCNA: Hyperbolic Contrastive Learning Framework for Self-Supervised Network Alignment, a novel network-guided self-supervised contrastive learning framework that uses hyperbolic graph convolution networks (GCNs) to build node representations. The resulting hyperbolic representations are subsequently used to align the two graphs. HCNA’s main novelty is that it guides the graph augmentation process so that the generated graph views are consistent with real-world structural properties. The hyperbolic embedding space (rather than the Euclidean one) is used to further strengthen the power of contrastive learning by better capturing the hierarchical structures of real-world networks. Although a recent study (Xiong, Yan, & Pan, 2021) uses contrastive learning to align multiplex networks by modeling their multiple structural views simultaneously through an end-to-end deep learning framework, it relies on labeled anchor nodes; hence, the major concerns with the supervised techniques discussed above persist. In contrast to this prior work, our proposed technique is unsupervised and analyzes the views along three dimensions: the local structure, the global structure, and the node attributes of the networks.
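
For intuition on the role of hyperbolic geometry in the matching step, the following simplified sketch (not HCNA’s actual multi-order hyperbolic GCN or its alignment procedure) computes Poincaré-ball distances between node embeddings of two networks and greedily matches each node of the first network to its hyperbolically closest node in the second; the embeddings here are stand-ins assumed to lie inside the unit ball.

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between points u and v inside the unit Poincare ball.

    d(u, v) = arccosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    """
    sq_diff = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * sq_diff / max(denom, eps))

def align_by_hyperbolic_distance(emb_a, emb_b):
    """Match every node of network A to its closest node of network B.

    emb_a: (n_a, dim) and emb_b: (n_b, dim) hyperbolic embeddings, all rows
    assumed to lie strictly inside the unit ball. This greedy nearest-neighbour
    step only illustrates the matching phase, not HCNA's alignment strategy.
    """
    matches = []
    for u in emb_a:
        dists = np.array([poincare_distance(u, v) for v in emb_b])
        matches.append(int(np.argmin(dists)))
    return matches

# Hypothetical usage with small 2-dimensional embeddings inside the ball.
rng = np.random.default_rng(1)
emb_a = rng.uniform(-0.4, 0.4, size=(5, 2))
emb_b = rng.uniform(-0.4, 0.4, size=(8, 2))
print(align_by_hyperbolic_distance(emb_a, emb_b))
```

Because distances grow rapidly near the boundary of the ball, hyperbolic space can separate tree-like (hierarchical) neighbourhoods in far fewer dimensions than Euclidean space, which is the motivation for adopting it here.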

A summary of our contributions is as follows:

  • 1.

    To the best of our knowledge, this is the first work on unsupervised network alignment that relies on self-supervised contrastive learning.

  • 2.

    The proposed model can capture the variances and hierarchical structures of real-life networks, as it generates and utilizes multiple views of both networks to produce embeddings in the hyperbolic space. Therefore, it effectively bridges contrastive learning with hyperbolic representation learning for network alignment.

  • 3.

    Our experimental investigations on four real-life datasets show that the proposed model outperforms ten state-of-the-art approaches in terms of Acc@1, Acc@10, AUC, and MAP.

  • 4.

    Extensive adaptivity experiments under adversarial conditions, such as structural noise, attribute noise, and graph size imbalance, show that our model is more resilient to structural and attribute noise than the ten state-of-the-art approaches. Additionally, ablation experiments on the model design show the effectiveness of each component of the proposed model and of the proposed loss function.

The organization of the paper is as follows. We briefly outline the related works in Section 2, and the background and problem statement in Section 3. In Section 4, we discuss the proposed model in detail, followed by the experimental details in Section 5. The results are discussed in Section 6, and we draw the conclusions in Section 7.

Section snippets

Related works

Network alignment has two phases: (1) generating a low-dimensional vector space representation of the nodes of each network and (2) utilizing these representations for node matching across networks. Therefore, existing network alignment approaches differ either in their network embedding approaches or in how they utilize information to map nodes based on a similarity measure. In this section, we initially elaborate on the network embedding approaches that are widely used for network alignment, followed by
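
As a schematic of the second phase only, here is a minimal sketch that assumes cosine-similarity matching between two independently produced embedding sets (it does not correspond to any specific method surveyed here) and evaluates the resulting ranking with the standard Acc@k measure against known anchor pairs.

```python
import numpy as np

def cosine_similarity_matrix(emb_a, emb_b):
    """Pairwise cosine similarities between node embeddings of two networks."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    return a @ b.T                                     # (n_a, n_b) scores

def acc_at_k(sim, anchors, k=10):
    """Fraction of ground-truth anchor pairs whose match is ranked in the top k.

    sim: (n_a, n_b) similarity matrix; anchors: list of (i, j) pairs meaning
    node i of network A corresponds to node j of network B.
    """
    hits = 0
    for i, j in anchors:
        top_k = np.argsort(-sim[i])[:k]   # indices of the k most similar nodes
        hits += int(j in top_k)
    return hits / len(anchors)

# Hypothetical usage with random placeholder embeddings and a toy anchor set.
rng = np.random.default_rng(2)
emb_a, emb_b = rng.normal(size=(50, 16)), rng.normal(size=(60, 16))
sim = cosine_similarity_matrix(emb_a, emb_b)
print(acc_at_k(sim, anchors=[(0, 0), (1, 1), (2, 2)], k=10))
```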

Self supervised contrastive learning

The primary goal of self-supervised learning is to learn the data’s rich structures from the data itself, rather than relying on sparsely labeled data that may not be sufficient for broad learning (Kolesnikov et al., 2019, Liu et al., 2021). Self-supervised contrastive learning implements this principle of learning representations by contrasting positive and negative examples in the data classes. These examples are derived from different views of the same graph to ensure self-supervision while

Proposed approach

In this section, we explain the overall alignment architecture of our proposed model, HCNA, in detail. We begin with a general overview of the model and then describe each of its components in detail.

Experimental details

In this section, we initially discuss the four real-world datasets that we use for our experiments in Section 5.1, followed by the existing research works on network alignment in Section 5.2. We discuss the different metrics that we use to evaluate HCNA and the baselines in Section 5.3, followed by the implementation details in Section 5.4.

Results and discussions

In this section, we provide the experimental results with their detailed analysis. In Section 6.1, we compare the performance of HCNA with several existing supervised and unsupervised network alignment approaches. We discuss the adaptivity of HCNA to adversarial conditions such as robustness to structural noise, attribute noise, and graph size imbalance in Section 6.2. We further perform ablation experiments on the model design to verify the importance of each of its components in Section 6.3.

Conclusion and future works

This paper proposes HCNA, an end-to-end unsupervised framework for network alignment on attributed networks. From a methodological standpoint, HCNA goes beyond the existing methods in several ways. First, it leverages the power of contrastive learning to implement a fully unsupervised network alignment approach that mitigates the need for manual labeling of anchor nodes in supervised settings. The contrastive learning approach better handles the noises and structural as well as the node

CRediT authorship contribution statement

Shruti Saxena: Conceptualization, Methodology, Implementation, Data curation, Writing. Roshni Chakraborty: Conceptualization, Writing – review & editing. Joydeep Chandra: Conceptualization, Writing – review & editing, Supervision.

References (77)

  • Xin, M., et al. (2020). Using multi-features to partition users for friends recommendation in location based social network. Information Processing & Management.
  • Zarrinkalam, F., et al. (2018). Mining user interests over active topics on social networks. Information Processing & Management.
  • Zhang, Y. (2008). Complex adaptive filtering user profile using graphical models. Information Processing & Management.
  • Zhang, Z., et al. (2021). A survey on concept factorization: From shallow to deep representation learning. Information Processing & Management.
  • Adcock, A. B., et al. Tree-like structure in large social and information networks.
  • Bayati, M., et al. (2013). Message-passing algorithms for sparse network alignment. ACM Transactions on Knowledge Discovery from Data (TKDD).
  • Belkin, M., et al. Laplacian eigenmaps and spectral techniques for embedding and clustering.
  • Bronstein, M. M., et al. (2017). Geometric deep learning: Going beyond Euclidean data. IEEE Signal Processing Magazine.
  • Cai, H., et al. (2018). A comprehensive survey of graph embedding: Problems, techniques, and applications. IEEE Transactions on Knowledge and Data Engineering.
  • Cancho, R. F. I., et al. (2001). The small world of human language. Proceedings of the Royal Society of London, Series B.
  • Chami, I., et al. (2019). Hyperbolic graph convolutional neural networks. Advances in Neural Information Processing Systems.
  • Chen, B., Huang, X., Xiao, L., Cai, Z., & Jing, L. (2020). Hyperbolic interaction model for hierarchical multi-label...
  • Chen, C., et al. (2019). Unsupervised adversarial graph alignment with graph embedding.
  • Cheng, A., Zhou, C., Yang, H., Wu, J., Li, L., Tan, J., et al. (2019). Deep active learning for anchor user prediction....
  • Derr, T., Karimi, H., Liu, X., Xu, J., & Tang, J. (2021). Deep adversarial network alignment. In Proceedings of the...
  • Du, X., et al. Joint link prediction and network alignment via cross-graph embedding.
  • Duin, R. P., et al. Non-Euclidean dissimilarities: Causes and informativeness.
  • Eagle, N. (2006). Reality mining: Sensing complex social systems. Personal and Ubiquitous Computing.
  • Goga, O., Loiseau, P., Sommer, R., Teixeira, R., & Gummadi, K. P. (2015). On the reliability of profile matching across...
  • Gonzalez, R. C., et al. (1992). Digital image processing.
  • Grover, A., & Leskovec, J. (2016). node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM...
  • Hand, D. J., et al. (2001). A simple generalisation of the area under the ROC curve for multiple class classification problems. Machine Learning.
  • Hassani, K., et al. Contrastive multi-view representation learning on graphs.
  • Heimann, M., Shen, H., Safavi, T., & Koutra, D. (2018). Regal: Representation learning-based graph alignment. In...
  • Hong, H., et al. (2020). Domain-adversarial network alignment. IEEE Transactions on Knowledge and Data Engineering.
  • Jonckheere, E., et al. (2008). Scaled Gromov hyperbolic graphs. Journal of Graph Theory.
  • Kaushal, R., et al. NeXLink: Node embedding framework for cross-network linkages across social networks.
  • Kipf, T. N., & Welling, M. (2017). Semi-supervised classification with graph convolutional networks. In International...