A Block-Based Triangle Counting Algorithm on Heterogeneous Environments
IEEE Transactions on Parallel and Distributed Systems (IF 5.6), Pub Date: 2021-06-29, DOI: 10.1109/tpds.2021.3093240
Abdurrahman Yasar, Sivasankaran Rajamanickam, Jonathan W. Berry, Umit V. Catalyurek

Triangle counting is a fundamental building block in graph algorithms. In this article, we propose a block-based triangle counting algorithm to reduce data movement during both sequential and parallel execution. Our block-based formulation makes the algorithm naturally suitable for heterogeneous architectures. The problem of partitioning the adjacency matrix of a graph is well-studied. Our task decomposition goes one step further: it partitions the set of triangles in the graph. By streaming these small tasks to compute resources, we can solve problems that do not fit on a device. We demonstrate the effectiveness of our approach by providing an implementation on a compute node with multiple sockets, cores and GPUs. The current state-of-the-art in triangle enumeration processes the Friendster graph in 2.1 seconds, not including data copy time between CPU and GPU. Using that metric, our approach is 20 percent faster. When copy times are included, our algorithm takes 3.2 seconds. This is 5.6 times faster than the fastest published CPU-only time.
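
The idea that a task decomposition can partition the set of triangles, rather than only the adjacency matrix, can be illustrated with a small sketch. This is not the authors' implementation: it assumes vertices are numbered 0..n-1, split into p contiguous blocks, and that each edge is stored once as (u, v) with u < v. The names `block_of`, `build_blocks`, `count_task`, and `count_triangles` are hypothetical. Under these assumptions, every triangle u < v < w falls into exactly one task (i, j, k) with i <= j <= k, so tasks can be processed independently and their counts summed.

```python
from itertools import combinations_with_replacement


def block_of(v, n, p):
    """Map vertex v in [0, n) to one of p contiguous vertex blocks."""
    size = max(1, -(-n // p))  # ceil(n / p), at least 1
    return v // size


def build_blocks(n, edges, p):
    """Group oriented edges (u < v) by (block of u, block of v).

    blocks[(i, j)][u] is the set of neighbors v > u with u in block i
    and v in block j. Self-loops and duplicate edges are ignored.
    """
    blocks = {}
    for a, b in edges:
        if a == b:
            continue
        u, v = (a, b) if a < b else (b, a)
        key = (block_of(u, n, p), block_of(v, n, p))
        blocks.setdefault(key, {}).setdefault(u, set()).add(v)
    return blocks


def count_task(blocks, i, j, k):
    """Count triangles u < v < w with u in block i, v in block j, w in block k.

    Only three edge blocks are touched: (i, j), (j, k), and (i, k).
    """
    a_ij = blocks.get((i, j))
    a_jk = blocks.get((j, k))
    a_ik = blocks.get((i, k))
    if not a_ij or not a_jk or not a_ik:
        return 0
    total = 0
    for u, vs in a_ij.items():
        ws_u = a_ik.get(u)
        if not ws_u:
            continue
        for v in vs:
            ws_v = a_jk.get(v)
            if ws_v:
                # Common higher-numbered neighbors of u and v close a triangle.
                total += len(ws_u & ws_v)
    return total


def count_triangles(n, edges, p):
    """Sum the independent task counts over all block triples i <= j <= k."""
    blocks = build_blocks(n, edges, p)
    return sum(
        count_task(blocks, i, j, k)
        for i, j, k in combinations_with_replacement(range(p), 3)
    )


if __name__ == "__main__":
    # Complete graph on 4 vertices, split into 2 blocks.
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(count_triangles(4, k4, 2))
```

Because each task reads only three edge blocks, a scheduler could in principle stream tasks to whichever device is free, which is the property the abstract relies on for graphs that do not fit on a single device. The sequential sum here stands in for that scheduling.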

Updated: 2021-06-29