Efficient Data Loader for Fast Sampling-Based GNN Training on Large Graphs
IEEE Transactions on Parallel and Distributed Systems (IF 5.6). Pub Date: 2021-03-12. DOI: 10.1109/tpds.2021.3065737
Youhui Bai, Cheng Li, Zhiqi Lin, Yufei Wu, Youshan Miao, Yunxin Liu, Yinlong Xu

Emerging graph neural networks (GNNs) have extended the successes of deep learning techniques on datasets such as images and text to more complex graph-structured data. By leveraging GPU accelerators, existing frameworks combine mini-batching and sampling for effective and efficient model training on large graphs. However, this setup faces a scalability issue: loading rich vertex features from CPU to GPU over a limited-bandwidth link usually dominates the training cycle. In this article, we propose PaGraph, a novel, efficient data loader that supports general and efficient sampling-based GNN training on a single server with multiple GPUs. PaGraph significantly reduces data loading time by exploiting available GPU resources to cache frequently accessed graph data. It also embodies a lightweight yet effective caching policy that simultaneously takes into account graph structural information and the data access patterns of sampling-based GNN training. Furthermore, to scale out to multiple GPUs, PaGraph develops a fast, GNN-computation-aware partition algorithm that avoids cross-partition access during data-parallel training and achieves better cache efficiency. Finally, it overlaps data loading with GNN computation to further hide loading costs. Evaluations on two representative GNN models, GCN and GraphSAGE, using two sampling methods, Neighbor and Layer-wise, show that PaGraph can eliminate data loading time from the GNN training pipeline and achieve up to a 4.8× performance speedup over state-of-the-art baselines. Together with a preprocessing optimization, PaGraph further delivers up to a 16.0× end-to-end speedup.
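
The structure-aware caching idea from the abstract can be made concrete. Below is a minimal sketch, not PaGraph's actual implementation, of a static GPU feature cache in PyTorch: vertices are ranked by out-degree as a proxy for how often sampling will touch them, and the top rows are pinned in spare GPU memory. The class name, constructor arguments, and the out-degree heuristic are illustrative assumptions consistent with the policy described above.

```python
import torch

class GPUFeatureCache:
    """Static cache sketch: keep the features of high out-degree vertices
    (those most likely to be hit by sampling) resident in GPU memory."""

    def __init__(self, cpu_feats, out_degrees, budget_rows, device="cuda:0"):
        # Rank vertices by out-degree and cache the top `budget_rows`.
        order = torch.argsort(out_degrees, descending=True)
        cached_ids = order[:budget_rows]
        self.gpu_feats = cpu_feats[cached_ids].to(device)
        # Map global vertex id -> cache slot; -1 marks a miss.
        self.slot = torch.full((cpu_feats.shape[0],), -1,
                               dtype=torch.long, device=device)
        self.slot[cached_ids.to(device)] = torch.arange(len(cached_ids),
                                                        device=device)
        self.cpu_feats = cpu_feats
        self.device = device

    def gather(self, batch_ids):
        """Fetch the feature rows one mini-batch needs: hits are served
        from GPU memory, misses are copied from host memory over PCIe."""
        batch_ids = batch_ids.to(self.device)
        slots = self.slot[batch_ids]
        hit = slots >= 0
        out = torch.empty(batch_ids.shape[0], self.cpu_feats.shape[1],
                          dtype=self.cpu_feats.dtype, device=self.device)
        out[hit] = self.gpu_feats[slots[hit]]
        out[~hit] = self.cpu_feats[batch_ids[~hit].cpu()].to(self.device)
        return out
```

Because the cache is filled once before training, lookups add no eviction bookkeeping to the training loop; the only per-batch cost is the gather itself plus PCIe traffic for the (hopefully few) misses.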

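The final optimization, overlapping data loading with GNN computation, can likewise be sketched. The following hypothetical PyTorch pipeline uses a side CUDA stream to copy the next batch's features while the current batch trains; `load_features` and `train_step` are placeholder callbacks, not PaGraph APIs.

```python
import torch

def train_overlapped(batches, load_features, train_step, device="cuda:0"):
    """Sketch of a two-stage pipeline: while the GPU trains on batch i,
    a side CUDA stream copies batch i+1's vertex features from pinned
    host memory, hiding the loading cost behind computation."""
    copy_stream = torch.cuda.Stream(device=device)
    with torch.cuda.stream(copy_stream):
        next_feats = load_features(batches[0]).pin_memory() \
                                              .to(device, non_blocking=True)
    for i, batch in enumerate(batches):
        # Ensure this batch's async copy has finished before compute uses it.
        torch.cuda.current_stream(device).wait_stream(copy_stream)
        feats = next_feats
        if i + 1 < len(batches):
            with torch.cuda.stream(copy_stream):
                next_feats = load_features(batches[i + 1]).pin_memory() \
                                                          .to(device, non_blocking=True)
        train_step(batch, feats)  # default stream, overlaps the side-stream copy
```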
Updated: 2021-04-20