Graph-Waving architecture: Efficient execution of graph applications on GPUs, Journal of Parallel and Distributed Computing

Graph-Waving architecture: Efficient execution of graph applications on GPUs
Journal of Parallel and Distributed Computing (IF 3.4), Pub Date: 2020-10-27, DOI: 10.1016/j.jpdc.2020.10.005
Ayse Yilmazer-Metin

Most existing graph frameworks for GPUs adopt a vertex-centric computing model that maps each vertex to a thread. When run on irregular graphs, vertex-to-thread mapping produces significant load imbalance within SIMD-groups. This uneven work distribution leads to low utilization of SIMD units and inefficient use of memory bandwidth. We introduce the Graph-Waving (GW) architecture to improve support for many graph applications on GPUs. It uses vertex-to-SIMD-group mapping, with Scalar-Waving as a mechanism for efficient execution. It also favors a narrow SIMD-group width, combined with a clustered issue approach and reuse of instructions in the front-end. We thoroughly evaluate the GW architecture using the timing-detailed GPGPU-Sim simulator with several graph and non-graph benchmarks from a variety of benchmark suites. Our results show that the GW architecture provides an average speedup of 4.4x and a maximum speedup of 10x on graph applications, while it obtains a 9% performance improvement on regular benchmarks and a 17% improvement on irregular benchmarks.
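The load imbalance described in the abstract can be illustrated with a small back-of-the-envelope model. The sketch below is not the paper's GW hardware or its Scalar-Waving mechanism; it only counts issued and useful lane-steps under two assumed mappings. The function names, the degree list, and the width of 4 are all hypothetical choices made for illustration.

```python
from math import ceil


def vertex_to_thread_utilization(degrees, width):
    """Lane utilization when each lane owns one vertex and loops over its edges.

    Assumed model: consecutive vertices are packed into SIMD-groups of
    `width` lanes, and a group keeps issuing until its busiest lane is done.
    """
    useful = total = 0
    for start in range(0, len(degrees), width):
        group = degrees[start:start + width]
        steps = max(group)        # the group runs as long as its largest vertex
        useful += sum(group)      # lane-steps that process a real edge
        total += steps * width    # lane-steps issued, idle lanes included
    return useful / total if total else 0.0


def vertex_to_group_utilization(degrees, width):
    """Lane utilization when a whole SIMD-group shares one vertex's edge list.

    Assumed model: the lanes of a group walk the edge list in strides of
    `width`, so a vertex of degree d needs ceil(d / width) steps.
    """
    useful = total = 0
    for d in degrees:
        steps = ceil(d / width)
        useful += d
        total += steps * width
    return useful / total if total else 0.0


# Hypothetical irregular graph: one high-degree vertex among low-degree ones.
degrees = [64, 1, 1, 1, 2, 1, 1, 1]
width = 4

print(f"vertex-to-thread:     {vertex_to_thread_utilization(degrees, width):.3f}")
print(f"vertex-to-SIMD-group: {vertex_to_group_utilization(degrees, width):.3f}")
```

In this toy model, the single high-degree vertex keeps three of the four lanes idle for most of the first group's run under vertex-to-thread mapping, while vertex-to-SIMD-group mapping spreads its edges across all lanes. Re-running with a larger `width` shows low-degree vertices wasting more lanes under group mapping, which is consistent with the abstract's preference for a narrow SIMD-group width, although the real design involves mechanisms this sketch does not model.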

Updated: 2020-11-06
