NextDoor: GPU-Based Graph Sampling for Graph Machine Learning
arXiv - CS - Distributed, Parallel, and Cluster Computing. Pub Date: 2020-09-14, DOI: arxiv-2009.06693
Abhinav Jangda, Sandeep Polisetty, Arjun Guha, Marco Serafini

Representation learning is a fundamental task in machine learning. It consists of learning the features of data items automatically, typically using a deep neural network (DNN), rather than selecting hand-engineered features, which tend to perform worse. Graph data requires specific algorithms for representation learning, such as DeepWalk, node2vec, and GraphSAGE. These algorithms first sample the input graph and then train a DNN on the samples. It is common to use GPUs for training, but graph sampling on GPUs is challenging. Sampling is an embarrassingly parallel task, since each sample can be generated independently. However, the irregularity of graphs makes it hard to use GPU resources effectively. Existing graph processing, mining, and representation learning systems do not parallelize sampling effectively, and this hurts the end-to-end performance of representation learning. In this paper, we present NextDoor, the first system specifically designed to perform graph sampling on GPUs. NextDoor introduces a high-level API based on a novel paradigm for parallel graph sampling called transit-parallelism. We implement several graph sampling applications and show that NextDoor runs them orders of magnitude faster than existing systems.
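
To make the "each sample can be generated independently" point concrete, here is a minimal CPU sketch of DeepWalk-style uniform random-walk sampling in plain Python. It is an illustration only: it is not NextDoor's API and does not implement transit-parallelism. The function names (`random_walk`, `sample_walks`), the adjacency-list layout, and the per-vertex seeding scheme are all assumptions made for this example.

```python
import random


def random_walk(adj, start, length, rng):
    """Return one uniform random walk of at most `length` steps from `start`."""
    walk = [start]
    for _ in range(length):
        neighbors = adj[walk[-1]]
        if not neighbors:  # vertex with no outgoing edges: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk


def sample_walks(adj, walk_length, seed=0):
    """Generate one walk per vertex (hypothetical helper, integer vertex ids).

    Each walk uses its own random generator and shares no state with the
    others, so the loop iterations could run in any order or in parallel.
    """
    walks = []
    for v in sorted(adj):
        rng = random.Random(seed * 1_000_003 + v)  # independent per-sample RNG
        walks.append(random_walk(adj, v, walk_length, rng))
    return walks


# Toy graph as an adjacency list; vertex 4 is isolated.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2], 4: []}
walks = sample_walks(adj, walk_length=4)
for w in walks:
    print(w)
```

Even in this toy graph the vertices have different numbers of neighbors, and the walk from vertex 4 ends immediately. On a GPU, that kind of variation means threads working on different samples do unequal amounts of work and touch scattered memory, which is the irregularity the abstract refers to.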

Updated: 2020-09-21