Optimizing Graph Processing and Preprocessing with Hardware Assisted Propagation Blocking,arXiv - CS - Hardware Architecture


arXiv - CS - Hardware Architecture, Pub Date: 2020-11-17, DOI: arxiv-2011.08451
Vignesh Balaji, Brandon Lucia

Extensive prior research has focused on alleviating the characteristic poor cache locality of graph analytics workloads. However, graph pre-processing tasks remain relatively unexplored. In many important scenarios, graph pre-processing tasks can be as expensive as the downstream graph analytics kernel. We observe that Propagation Blocking (PB), a software optimization designed for SpMV kernels, generalizes to many graph analytics kernels as well as common pre-processing tasks. In this work, we identify the lingering inefficiencies of a PB execution on conventional multicores and propose architecture support to eliminate PB's bottlenecks, further improving the performance gains from PB. Our proposed architecture -- COBRA -- optimizes the PB execution of both graph processing and pre-processing alike to provide end-to-end speedups of up to 4.6x (3.5x on average).
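The abstract names Propagation Blocking but does not spell out its mechanics. As a rough illustration only, the following Python sketch shows the general two-phase idea usually associated with PB: first bin each update by destination range, then apply each bin's updates while its destination slice is small enough to stay cache-resident. The function names, the `bin_width` parameter, and the edge-list layout are assumptions made for this sketch; it is not the authors' implementation and does not model the COBRA hardware support.

```python
def scatter_add_direct(num_vertices, edges, values):
    """Baseline: add each source's value straight into its destination.

    Writes to sums[dst] jump irregularly across the whole array, which is
    the poor-locality access pattern the abstract refers to.
    """
    sums = [0.0] * num_vertices
    for src, dst in edges:
        sums[dst] += values[src]
    return sums


def scatter_add_propagation_blocking(num_vertices, edges, values, bin_width):
    """Same result as scatter_add_direct, computed in two phases.

    bin_width (hypothetical parameter) is the number of destination
    vertices covered by one bin; in practice it would be chosen so that
    one bin's slice of `sums` fits in cache.
    """
    num_bins = (num_vertices + bin_width - 1) // bin_width
    bins = [[] for _ in range(num_bins)]

    # Phase 1 (binning): stream over the edges once and append each
    # (destination, contribution) pair to the bin that owns the destination.
    for src, dst in edges:
        bins[dst // bin_width].append((dst, values[src]))

    # Phase 2 (accumulate): drain one bin at a time, so all writes fall
    # inside a single contiguous range of bin_width destinations.
    sums = [0.0] * num_vertices
    for updates in bins:
        for dst, contribution in updates:
            sums[dst] += contribution
    return sums


if __name__ == "__main__":
    edges = [(0, 3), (1, 3), (2, 0), (3, 1), (0, 1)]
    values = [1.0, 2.0, 4.0, 8.0]
    direct = scatter_add_direct(4, edges, values)
    blocked = scatter_add_propagation_blocking(4, edges, values, bin_width=2)
    print(direct == blocked)
```

The sketch trades one extra pass over the data and temporary bin storage for more regular memory accesses in the second phase. Pure Python will not show a speedup; the code is meant only to make the data flow concrete.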


Updated: 2020-11-18
