GIST: Distributed Training for Large-Scale Graph Convolutional Networks
arXiv - CS - Distributed, Parallel, and Cluster Computing, Pub Date: 2021-02-20, DOI: arxiv-2102.10424
Cameron R. Wolfe, Jingkang Yang, Arindam Chowdhury, Chen Dun, Artun Bayer, Santiago Segarra, Anastasios Kyrillidis

The graph convolutional network (GCN) is a go-to solution for machine learning on graphs, but its training is notoriously difficult to scale in terms of both the size of the graph and the number of model parameters. These limitations are in stark contrast to the increasing scale (in data size and model size) of experiments in deep learning research. In this work, we propose GIST, a novel distributed approach that enables efficient training of wide (overparameterized) GCNs on large graphs. GIST is a hybrid layer and graph sampling method, which disjointly partitions the global model into several smaller sub-GCNs that are independently trained across multiple GPUs in parallel. This distributed framework improves model performance and significantly decreases wall-clock training time. GIST seeks to enable large-scale GCN experimentation with the goal of bridging the existing gap in scale between graph machine learning and deep learning.
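The abstract only outlines the approach, so the following is a minimal PyTorch sketch of the hidden-dimension (layer) partitioning idea it describes: the weights of a wide 2-layer GCN are split into disjoint slices, each slice is trained as an independent narrow sub-GCN, and the trained slices are written back into the global model. All names, dimensions, and the training loop here are illustrative assumptions, not the authors' implementation; the actual GIST method additionally uses graph sampling and runs the sub-GCNs on separate GPUs in parallel.

```python
# Hypothetical sketch of hidden-dimension partitioning for a 2-layer GCN.
# Not the authors' code: GIST also samples subgraphs and distributes
# sub-GCNs across GPUs, which is omitted here for brevity.
import torch
import torch.nn.functional as F

def train_sub_gcns(a_hat, x, y, hidden_dim=64, num_sub=2, steps=10, lr=0.01):
    n_feats, n_classes = x.shape[1], int(y.max()) + 1
    # Global (wide) model parameters.
    W1 = torch.randn(n_feats, hidden_dim) * 0.01
    W2 = torch.randn(hidden_dim, n_classes) * 0.01
    # Disjoint partition of the hidden-feature indices, one block per sub-GCN.
    blocks = torch.chunk(torch.randperm(hidden_dim), num_sub)

    for idx in blocks:  # in GIST these sub-GCNs would train on separate GPUs in parallel
        # Each sub-GCN only sees its own slice of the hidden dimension.
        w1 = W1[:, idx].clone().requires_grad_(True)
        w2 = W2[idx, :].clone().requires_grad_(True)
        opt = torch.optim.SGD([w1, w2], lr=lr)
        for _ in range(steps):
            h = torch.relu(a_hat @ x @ w1)          # narrow hidden layer
            loss = F.cross_entropy(a_hat @ h @ w2, y)
            opt.zero_grad()
            loss.backward()
            opt.step()
        # Copy the independently trained slices back into the global model.
        with torch.no_grad():
            W1[:, idx] = w1
            W2[idx, :] = w2
    return W1, W2

# Toy usage: a small random graph with self-loops (adjacency normalization omitted).
N = 8
a_hat = (torch.rand(N, N) < 0.3).float() + torch.eye(N)
x = torch.randn(N, 5)
y = torch.randint(0, 3, (N,))
W1, W2 = train_sub_gcns(a_hat, x, y, hidden_dim=16, num_sub=2, steps=5)
```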

Updated: 2021-02-23