HCNA: Hyperbolic Contrastive Learning Framework for Self-Supervised Network Alignment
Introduction
Graph-based machine learning techniques have shown considerable potential in solving a wide range of problems in a variety of fields, including biological science (protein interactions (Theocharidis, Van Dongen, Enright, & Freeman, 2009), drug side effects (Shabani-Mashcool, Marashi, & Gharaghani, 2020)), social network analysis (friendship networks) (Eagle et al., 2006), and linguistics (word co-occurrence networks) (Cancho & Solé, 2001). Hence, the study of real-world networks to mine and gather insights has gained considerable interest (Zarrinkalam, Kahani, & Bagheri, 2018). While analyzing a single network is essential for various applications like detecting communities (Fani et al., 2020), predicting links (Wu, Zhang, & Ren, 2017), and user modeling (Zhang, 2008), some problems, such as graph clustering (Wang et al., 2021) and alignment (Goga, Loiseau, Sommer, Teixeira, & Gummadi, 2015), cannot be solved without addressing the interactions between graphs. Network alignment refers to the mapping of the same entities (termed anchor nodes) across different networks (Goga et al., 2015, Zhang and Philip, 2015). It finds applications in diverse areas such as biological networks, social networks, and computer vision, to name a few (Bayati, Gleich, Saberi, & Wang, 2013). Finding node correspondences across networks also alleviates the sparsity issue of analyzing a single network, such as a social network, and facilitates downstream tasks like friend and product recommendation (Ahmadian et al., 2018, Sun et al., 2015, Xin and Wu, 2020).
Most recent network alignment approaches rely mainly on deep learning-based node embedding techniques and perform alignment with the help of labeled anchor node information (Kaushal et al., 2020, Liu et al., 2016, Liu et al., 2019, Man et al., 2016, Zhou et al., 2018) that supervises the alignment process. However, labeling these anchor nodes requires domain knowledge and extensive manual effort. Furthermore, utilizing only a small number of labeled anchor nodes may not be enough to capture the structural variations of all the nodes in the network, resulting in poor generalization. Hence, several recent semi-supervised and unsupervised approaches learn the anchor node mappings purely from the network structure and node attributes (Du et al., 2019, Hong et al., 2020, Trung, Van Vinh, Tam, Yin, Weidlich, and Hung, 2020). However, a major issue with these approaches is the underlying assumption of strong topological and attribute consistency constraints (Heimann, Shen, Safavi, & Koutra, 2018). The topological consistency constraint presumes that two nodes close in one network are likewise close in the other, while the attribute consistency constraint demands that corresponding nodes in different networks have the same attribute values. In practice, however, such assumptions are frequently violated. Users’ social connections on Facebook and LinkedIn, for example, may be completely different; moreover, because LinkedIn is a professional network, the user information made public on LinkedIn may differ considerably from that on Facebook.
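To make the topological consistency constraint concrete, the following sketch (illustrative only; the names `adj_a`, `adj_b`, and `anchor` are our own and not part of HCNA) measures what fraction of anchored edges in one network are preserved in the other. A score of 1.0 means perfect topological consistency; real network pairs such as Facebook and LinkedIn typically score well below it.

```python
def topological_consistency(adj_a, adj_b, anchor):
    """Fraction of anchored edges of network A preserved in network B:
    for each edge (u, v) of A whose endpoints are both anchored, check
    whether (anchor[u], anchor[v]) is also an edge of B. A score of 1.0
    means perfect topological consistency."""
    hits = total = 0
    for u, nbrs in adj_a.items():
        for v in nbrs:
            if u in anchor and v in anchor:
                total += 1
                hits += anchor[v] in adj_b.get(anchor[u], set())
    return hits / total if total else 1.0
```

For example, if a path 0–1–2 is anchored to x–y–z but the second network is missing the y–z edge, the score drops to 0.5, quantifying exactly the kind of violation discussed above.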
Another central assumption is that the sampled networks for alignment fully capture the underlying topologies and node attributes (Koutra et al., 2013, Zhang and Philip, 2015). As a result, these approaches map the closer nodes in the embedding space under strict notions of consistency constraints. However, this is a bold assumption: it is challenging to collect complete and precise knowledge about the topology and node attributes of many large real-world networks, such as Twitter and Facebook, and the sampled networks may contain both missing links and missing nodes.
To alleviate this intrinsic structural and attribute noise in the networks, we draw inspiration from the recent success of self-supervised contrastive learning approaches for creating graph representations. The main idea is to capture network variations (or noise) by learning from multiple random views of a network (Hassani & Khasahmadi, 2020). However, the structural variations of different networks are not purely random; hence, directly applying these techniques to create simultaneous node representations for anchor node mapping may be ineffective. Instead, these variations demand appropriate guidance of the contrastive learning mechanism to capture the broad spectrum of structural possibilities relevant to network alignment in real-world scenarios.
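The view-contrast principle can be sketched as follows. This is a generic InfoNCE-style objective paired with random edge dropping, a common augmentation in graph contrastive learning; it is an illustration of the idea under our own simplifying assumptions, not HCNA's actual loss or augmentation scheme.

```python
import numpy as np

def drop_edges(edges, p=0.2, rng=None):
    """Random edge dropping: keep each edge independently with
    probability 1 - p, yielding a perturbed 'view' of the graph."""
    if rng is None:
        rng = np.random.default_rng(0)
    keep = rng.random(len(edges)) > p
    return [e for e, k in zip(edges, keep) if k]

def info_nce(z1, z2, tau=0.5):
    """InfoNCE-style contrastive loss between two views.

    z1, z2: (n, d) embeddings of the same n nodes under two augmented
    views; row i of z1 and row i of z2 form the positive pair, and all
    other rows act as negatives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau                       # (n, n) cosine similarities
    sim = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))          # pull positive pairs together
```

Minimizing this loss pulls each node's two view embeddings together while pushing apart the embeddings of different nodes: correctly paired views yield a lower loss than mismatched ones.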
In this paper, we introduce a novel network-guided self-supervised contrastive learning framework, HCNA: Hyperbolic Contrastive Learning Framework for Self-Supervised Network Alignment, that uses hyperbolic graph convolution networks (GCNs) to build node representations. The resulting hyperbolic representations are subsequently used to align the two graphs. HCNA's main novelty is that it guides the graph augmentation process so that the generated graph views are consistent with real-world structural properties. The hyperbolic embedding space (rather than the Euclidean one) further strengthens the power of contrastive learning by better capturing the hierarchical structures of real-world networks. Although a recent study (Xiong, Yan, & Pan, 2021) uses contrastive learning to align multiplex networks by modeling their multiple structural views simultaneously through an end-to-end deep learning framework, it relies on labeled anchor nodes, so the major concerns with the supervised techniques discussed above persist. In contrast to this prior work, our proposed technique is unsupervised and analyzes the views along three dimensions: the local structure, the global structure, and the node attributes of the networks.
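The advantage of hyperbolic space comes from its distance function. The sketch below computes the geodesic distance in the Poincaré ball, one standard model of hyperbolic space (HCNA builds on hyperbolic GCNs such as those of Chami et al., whose exact formulation differs): distances grow without bound near the boundary of the ball, which is what lets tree-like hierarchies embed with low distortion.

```python
import numpy as np

def poincare_dist(u, v, eps=1e-9):
    """Geodesic distance between two points inside the unit (Poincare) ball:

        d(u, v) = arccosh(1 + 2*|u-v|^2 / ((1 - |u|^2) * (1 - |v|^2)))

    A hierarchy embeds naturally with the root near the origin and the
    leaves near the boundary, where distances blow up."""
    uu, vv = np.sum(u * u), np.sum(v * v)
    duv = np.sum((u - v) ** 2)
    x = 1.0 + 2.0 * duv / max((1.0 - uu) * (1.0 - vv), eps)
    return float(np.arccosh(x))
```

For instance, two points at radius 0.9 on opposite sides of the origin are 1.8 apart in Euclidean terms but roughly 5.9 apart hyperbolically, mirroring how leaves of different subtrees are far apart in a tree metric.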
A summary of our contributions is as follows:
1. To the best of our knowledge, this is the first work on unsupervised network alignment that relies on self-supervised contrastive learning.
2. The proposed model captures the variances and hierarchical structures of real-life networks, as it generates and utilizes multiple views of both networks to produce embeddings in hyperbolic space. It thereby effectively bridges contrastive learning with hyperbolic representation learning for network alignment.
3. Our experimental investigations on four real-life datasets show that the proposed model outperforms ten state-of-the-art approaches across all reported evaluation metrics.
4. Extensive experiments on adaptivity under adversarial conditions, such as structural noise, attribute noise, and graph size imbalance, show that our model is more resilient to these perturbations than the ten state-of-the-art baselines. Additionally, ablation experiments on the model design show the effectiveness of each component of the proposed model and of the proposed loss function.
The organization of the paper is as follows. We briefly outline the related works in Section 2 and the background and problem statement in Section 3. In Section 4, we discuss the proposed model in detail, followed by the experimental details in Section 5. The results are discussed in Section 6, and we draw the conclusions in Section 7.
Related works
Network alignment has two phases: generating a low-dimensional vector space representation of the nodes of each network, and utilizing these representations for node matching across networks. Therefore, existing network alignment approaches differ either in their network embedding approach or in how they utilize the embeddings to map nodes based on a similarity measure. In this section, we initially elaborate on the network embedding approaches that are widely used for network alignment followed by
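The second phase, matching nodes across networks from their embeddings, can be sketched as a nearest-neighbor search under cosine similarity. This is a simplified stand-in for the similarity measures used by the surveyed methods, not any one method's exact matcher.

```python
import numpy as np

def align_by_similarity(emb_a, emb_b):
    """Matching phase of network alignment: given node embeddings of
    two networks in a shared space, map each node of network A to its
    most similar node of network B under cosine similarity."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    sim = a @ b.T                 # (n_a, n_b) similarity matrix
    return sim.argmax(axis=1)     # predicted counterpart for each node of A
```

When the two embedding sets are simply permuted copies of one another, this recovers the permutation exactly; the hard part, which the embedding phase must solve, is making corresponding nodes of noisy, inconsistent networks land near each other in the shared space.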
Self-supervised contrastive learning
The primary goal of self-supervised learning is to learn the data’s rich structures from the data itself, rather than relying on sparsely labeled data that may not be sufficient for broad learning (Kolesnikov et al., 2019, Liu et al., 2021). Self-supervised contrastive learning implements this principle of learning representations by contrasting positive and negative examples in the data classes. These examples are derived from different views of the same graph to ensure self-supervision while
Proposed approach
In this section, we explain the overall alignment architecture of our proposed model, HCNA. We begin with a general overview of the model, then go over each of its components in detail.
Experimental details
In this section, we initially discuss the real-world datasets that we use for our experiments in Section 5.1 followed by the existing research works on network alignment in Section 5.2. We discuss the different metrics that we use to evaluate HCNA and the baselines in Section 5.3 followed by the implementation details in Section 5.4.
Results and discussions
In this section, we provide the experimental results with their detailed analysis. In Section 6.1, we compare the performance of HCNA with several existing supervised and unsupervised network alignment approaches. We discuss the adaptivity of HCNA to adversarial conditions such as robustness to structural noise, attribute noise, and graph size imbalance in Section 6.2. We further perform ablation experiments on the model design to verify the importance of each of its components in Section 6.3.
Conclusion and future works
This paper proposes HCNA, an end-to-end unsupervised framework for network alignment on attributed networks. From a methodological standpoint, HCNA goes beyond the existing methods in several ways. First, it leverages the power of contrastive learning to implement a fully unsupervised network alignment approach that mitigates the need for manual labeling of anchor nodes in supervised settings. The contrastive learning approach better handles the noises and structural as well as the node
CRediT authorship contribution statement
Shruti Saxena: Conceptualization, Methodology, Implementation, Data curation, Writing. Roshni Chakraborty: Conceptualization, Writing – review & editing. Joydeep Chandra: Conceptualization, Writing – review & editing, Supervision.
References (77)
- et al. A social recommendation method based on an adaptive neighbor selection mechanism. Information Processing & Management (2018).
- et al. Multi-heterogeneous neighborhood-aware for knowledge graphs alignment. Information Processing & Management (2022).
- et al. User community detection via embedding of social network structure and temporal content. Information Processing & Management (2020).
- et al. Enrich cross-lingual entity links for online wikis via multi-modal semantic matching. Information Processing & Management (2020).
- et al. Rebuilding community ecology from functional traits. Trends in Ecology & Evolution (2006).
- et al. Structural representation learning for network alignment with self-supervised anchor links. Expert Systems with Applications (2021).
- et al. NDDSA: A network- and domain-based method for predicting drug-side effect associations. Information Processing & Management (2020).
- et al. Mining affective text to improve social media item recommendation. Information Processing & Management (2015).
- et al. Trio-based collaborative multi-view graph clustering with multiple constraints. Information Processing & Management (2021).
- et al. A balanced modularity maximization link prediction model in social networks. Information Processing & Management (2017).
- Using multi-features to partition users for friends recommendation in location based social network. Information Processing & Management.
- Mining user interests over active topics on social networks. Information Processing & Management.
- Complex adaptive filtering user profile using graphical models. Information Processing & Management.
- A survey on concept factorization: From shallow to deep representation learning. Information Processing & Management.
- Tree-like structure in large social and information networks.
- Message-passing algorithms for sparse network alignment. ACM Transactions on Knowledge Discovery from Data (TKDD).
- Laplacian eigenmaps and spectral techniques for embedding and clustering.
- Geometric deep learning: Going beyond Euclidean data. IEEE Signal Processing Magazine.
- A comprehensive survey of graph embedding: Problems, techniques, and applications. IEEE Transactions on Knowledge and Data Engineering.
- The small world of human language. Proceedings of the Royal Society of London, Series B.
- Hyperbolic graph convolutional neural networks. Advances in Neural Information Processing Systems.
- Unsupervised adversarial graph alignment with graph embedding.
- Joint link prediction and network alignment via cross-graph embedding.
- Non-Euclidean dissimilarities: Causes and informativeness.
- Reality mining: Sensing complex social systems. Personal and Ubiquitous Computing.
- Digital image processing.
- A simple generalisation of the area under the ROC curve for multiple class classification problems. Machine Learning.
- Contrastive multi-view representation learning on graphs.
- Domain-adversarial network alignment. IEEE Transactions on Knowledge and Data Engineering.
- Scaled Gromov hyperbolic graphs. Journal of Graph Theory.
- NeXLink: Node embedding framework for cross-network linkages across social networks.
Cited by (6)
- Learning context-aware region similarity with effective spatial normalization over Point-of-Interest data. Information Processing and Management (2024).
- DAWN: Domain Generalization Based Network Alignment. IEEE Transactions on Big Data (2023).
- Aligning Users across Social Networks via Integrating Structural Similarity and Graph Representation Learning. 2023 9th International Conference on Big Data and Information Analytics, BigDIA 2023 - Proceedings (2023).
- EvoAlign: A Continual Learning Framework for Aligning Evolving Networks. 2023 IEEE 10th International Conference on Data Science and Advanced Analytics, DSAA 2023 - Proceedings (2023).
- SAlign: A Graph Neural Attention Framework for Aligning Structurally Heterogeneous Networks. Journal of Artificial Intelligence Research (2023).