Sub-graph Contrast for Scalable Self-Supervised Graph Representation Learning
arXiv - CS - Machine Learning Pub Date : 2020-09-22 , DOI: arxiv-2009.10273
Yizhu Jiao, Yun Xiong, Jiawei Zhang, Yao Zhang, Tianqi Zhang, Yangyong Zhu

Graph representation learning has attracted considerable attention recently. Existing graph neural networks that are fed the complete graph do not scale, because their computation and memory requirements grow with graph size; capturing the rich information in large-scale graph data therefore remains a great challenge. Moreover, these methods focus mainly on supervised learning and depend heavily on node label information, which is expensive to obtain in the real world. Unsupervised network embedding approaches, in contrast, overemphasize node proximity, and the representations they learn can hardly be used directly in downstream application tasks. In recent years, emerging self-supervised learning has offered a potential solution to these problems. However, existing self-supervised works also operate on the complete graph, and in defining their mutual-information-based loss terms they are biased toward fitting either global or very local (1-hop neighborhood) graph structures. In this paper, a novel self-supervised representation learning method via subgraph contrast, namely \textsc{Subg-Con}, is proposed; it exploits the strong correlation between central nodes and their sampled subgraphs to capture regional structure information. Instead of learning on the complete input graph, \textsc{Subg-Con} applies a novel data augmentation strategy and learns node representations through a contrastive loss defined on subgraphs sampled from the original graph. Compared with existing graph representation learning approaches, \textsc{Subg-Con} has prominent advantages in weaker supervision requirements, model learning scalability, and parallelization. Extensive experiments on multiple real-world large-scale benchmark datasets from different domains verify both the effectiveness and the efficiency of our work against classic and state-of-the-art graph representation learning approaches.
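The core idea described above (contrasting a central node's embedding with embeddings of sampled subgraphs) can be illustrated with a minimal sketch. Everything here is an assumption-laden stand-in, not the paper's implementation: `sample_subgraph` takes the center plus its nearest neighbors (a simple proxy for the importance-based sampling the paper describes), `encode` averages raw features in place of a trained GNN encoder, and the loss is a generic margin-based contrastive loss pushing the score of a node with its own subgraph above its score with another node's subgraph.

```python
import numpy as np

def sample_subgraph(adj, center, size):
    # Keep the center plus up to (size - 1) of its direct neighbors.
    # A stand-in for the importance-based subgraph sampling in the paper.
    neighbors = np.flatnonzero(adj[center])
    return np.concatenate(([center], neighbors[: size - 1]))

def encode(features, nodes):
    # Hypothetical encoder: mean-pool raw features over the node set,
    # standing in for a GNN that produces node/subgraph embeddings.
    return features[nodes].mean(axis=0)

def margin_contrastive_loss(node_emb, pos_emb, neg_emb, margin=0.5):
    # Score a (node, subgraph) pair with a sigmoid of their dot product,
    # then require the positive pair to beat the negative pair by a margin.
    def score(a, b):
        return 1.0 / (1.0 + np.exp(-np.dot(a, b)))
    return max(0.0, score(node_emb, neg_emb) - score(node_emb, pos_emb) + margin)

# Toy graph: 4 nodes, one-hot features.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 0],
                [0, 1, 0, 0]])
feats = np.eye(4)

sub = sample_subgraph(adj, center=0, size=3)          # node 0's subgraph
h   = encode(feats, np.array([0]))                    # central node embedding
z_p = encode(feats, sub)                              # its subgraph embedding
z_n = encode(feats, sample_subgraph(adj, 3, 3))       # another node's subgraph
loss = margin_contrastive_loss(h, z_p, z_n)
```

In training, such a loss would be minimized over many (node, subgraph) pairs sampled in parallel, which is what gives the subgraph-based approach its scalability advantage over methods that process the full graph at once.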

Updated: 2020-10-09