Graph-Waving architecture: Efficient execution of graph applications on GPUs
Journal of Parallel and Distributed Computing (IF 3.4), Pub Date: 2020-10-27, DOI: 10.1016/j.jpdc.2020.10.005
Ayse Yilmazer-Metin

Most existing graph frameworks for GPUs adopt a vertex-centric computing model that maps each vertex to a thread. When such frameworks run on irregular graphs, we observe significant load imbalance within SIMD-groups. Uneven work distribution within SIMD-groups leads to low utilization of SIMD units and inefficient use of memory bandwidth. We introduce the Graph-Waving (GW) architecture to improve support for many graph applications on GPUs. It uses vertex-to-SIMD-group mapping and Scalar-Waving as a mechanism for efficient execution. It also favors a narrow SIMD-group width with a clustered issue approach and reuse of instructions in the front-end. We thoroughly evaluate the GW architecture using the timing-detailed GPGPU-Sim simulator with several graph and non-graph benchmarks from a variety of benchmark suites. Our results show that the GW architecture provides an average speedup of 4.4x and a maximum speedup of 10x on graph applications, while it obtains a 9% performance improvement on regular benchmarks and a 17% improvement on irregular benchmarks.
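The load-imbalance argument in the abstract can be illustrated with a toy lane-utilization model. The sketch below is not the paper's architecture or its simulator setup. It assumes one unit of work per edge, and the function names and the degree list are hypothetical, chosen only for illustration. It compares the fraction of issued SIMD lane-slots that do useful work when each lane owns one vertex versus when each SIMD-group owns one vertex and spreads that vertex's edges across its lanes.

```python
from math import ceil


def vertex_to_thread_utilization(degrees, width):
    """Toy model: each lane owns one vertex, so a SIMD-group keeps
    issuing until its highest-degree vertex has processed all edges."""
    useful = 0
    issued = 0
    for start in range(0, len(degrees), width):
        group = degrees[start:start + width]
        steps = max(group)          # the group is held by its slowest lane
        useful += sum(group)        # lane-slots that processed an edge
        issued += steps * width     # lane-slots occupied by the group
    return useful / issued if issued else 1.0


def vertex_to_group_utilization(degrees, width):
    """Toy model: each SIMD-group owns one vertex and spreads that
    vertex's edges across its lanes, `width` edges per step."""
    useful = 0
    issued = 0
    for degree in degrees:
        steps = ceil(degree / width)
        useful += degree
        issued += steps * width
    return useful / issued if issued else 1.0


if __name__ == "__main__":
    # Hypothetical, skewed degree distribution (an "irregular" graph).
    degrees = [1, 2, 1, 64, 3, 1, 2, 128]
    for width in (4, 32):
        print(width,
              vertex_to_thread_utilization(degrees, width),
              vertex_to_group_utilization(degrees, width))
```

In this simplified model, a few high-degree vertices leave most lanes idle under vertex-to-thread mapping, while under vertex-to-group mapping a narrower group wastes fewer lane-slots on low-degree vertices. This is consistent with the abstract's preference for a narrow SIMD-group width, though the model ignores memory behavior, scheduling, and the Scalar-Waving mechanism entirely.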




Updated: 2020-11-06