GNNIE: GNN Inference Engine with Load-balancing and Graph-Specific Caching
arXiv - CS - Hardware Architecture Pub Date : 2021-05-21 , DOI: arxiv-2105.10554
Sudipta Mondal, Susmita Dey Manasi, Kishor Kunal, Sachin S. Sapatnekar

Analysis engines based on Graph Neural Networks (GNNs) are vital for many real-world problems that model relationships using large graphs. Challenges for a GNN hardware platform include the ability to (a) host a variety of GNNs, (b) handle the high sparsity of input node feature vectors and the graph adjacency matrix, together with the accompanying random memory access patterns, and (c) maintain load-balanced computation in the face of uneven workloads induced by high sparsity and the power-law vertex degree distributions of real datasets. This paper proposes GNNIE, an accelerator designed to run a broad range of GNNs. It tackles workload imbalance by (i) splitting node feature operands into blocks, (ii) reordering and redistributing computations, and (iii) using a flexible MAC architecture with low communication overheads among the processing elements. In addition, it adopts a graph partitioning scheme and a graph-specific caching policy that use off-chip memory bandwidth efficiently and are well suited to the characteristics of real-world graphs. Random memory access effects are mitigated by partitioning and degree-aware caching, which enable the reuse of high-degree vertices. Over multiple datasets on graph attention networks (GATs), graph convolutional networks (GCNs), GraphSAGE, GINConv, and DiffPool, GNNIE achieves average speedups of over 8890x over a CPU and 295x over a GPU. Compared to prior approaches, GNNIE achieves an average speedup of 9.74x over HyGCN for GCN, GraphSAGE, and GINConv; HyGCN cannot implement GATs. GNNIE achieves an average speedup of 2.28x over AWB-GCN (which runs only GCNs), despite using 3.4x fewer processing units.
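The intuition behind the degree-aware caching mentioned above is that in power-law graphs a small set of hub vertices appears in a disproportionate share of edges, so pinning their feature vectors on chip captures most accesses. A minimal toy model of this idea (an illustration only; the function name, edge-list format, and hit-rate metric are assumptions, not the paper's actual policy) is:

```python
from collections import Counter

def degree_aware_cache(edges, cache_slots):
    """Pin the highest-degree vertices in a fixed-size on-chip cache.

    Toy model: in a power-law graph, caching a few hub vertices
    captures a large fraction of edge-endpoint accesses.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Select the cache_slots vertices with the highest degree.
    cached = {v for v, _ in degree.most_common(cache_slots)}
    # Fraction of edge-endpoint feature accesses served from the cache.
    hits = sum((u in cached) + (v in cached) for u, v in edges)
    return cached, hits / (2 * len(edges))

# A small star-plus-chain graph: vertex 0 is the hub.
edges = [(0, i) for i in range(1, 8)] + [(1, 2), (2, 3)]
cached, hit_rate = degree_aware_cache(edges, cache_slots=1)
# Caching only the hub (1 of 8 vertices) already serves 7 of 18 accesses.
```

Even one cache slot serves a large share of accesses here because the hub vertex is an endpoint of most edges, which mirrors why degree-aware policies suit real-world graphs.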

Updated: 2021-05-25