Feluca: A Two-Stage Graph Coloring Algorithm with Color-centric Paradigm on GPU
IEEE Transactions on Parallel and Distributed Systems (IF 5.6), Pub Date: 2021-01-01, DOI: 10.1109/tpds.2020.3014173
Zhigao Zheng, Xuanhua Shi, Ligang He, Hai Jin, Shuo Wei, Hulin Dai, Xuan Peng

Performing graph coloring on GPUs poses several challenges. First, the recursion algorithm suffers from a long-tail problem, because conflicts (i.e., different threads assigning the same color to adjacent nodes) become more likely as the number of iterations increases. Second, the sequential spread algorithm is hard to parallelize, because the color allocation depends on the adjoining iteration. Third, atomic operations are widely used on GPUs to maintain the color list, which can greatly reduce the efficiency of GPU threads. In this article, we propose Feluca, a two-stage high-performance graph coloring algorithm that aims to address these challenges. Feluca combines the recursion-based method with the sequential spread-based method. In the first stage, Feluca uses a recursive routine to color the majority of the vertices in the graph. It then switches to the sequential spread method to color the remaining vertices, which avoids the conflicts of the recursive algorithm. In addition, the following techniques are proposed to further improve graph coloring performance: i) a new method to eliminate the cycles in the graph; ii) a top-down scheme that avoids the atomic operation originally required for color selection; and iii) a novel color-centric coloring paradigm that improves the degree of parallelism of the sequential spread part. These newly developed techniques, together with further GPU-specific optimizations such as coalesced memory access, make up an efficient parallel graph coloring solution in Feluca. We have conducted extensive experiments on NVIDIA GPUs. The results show that Feluca achieves a 1.19–8.39× speedup over the state-of-the-art algorithms.
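
The two-stage structure described in the abstract can be illustrated with a small CPU-side sketch. It is not the authors' GPU implementation: the function names, the `max_rounds` cutoff, the rule that the lower-numbered vertex keeps its color on a conflict, and the serial loops standing in for GPU threads are all assumptions made for illustration. The cycle-elimination and top-down techniques from the abstract are not modeled.

```python
from typing import List

UNCOLORED = -1  # marker for a vertex that has no color yet


def smallest_free_color(v: int, adj: List[List[int]], color: List[int]) -> int:
    """Return the smallest color not used by any colored neighbor of v."""
    used = {color[u] for u in adj[v] if color[u] != UNCOLORED}
    c = 0
    while c in used:
        c += 1
    return c


def speculative_stage(adj: List[List[int]], color: List[int], max_rounds: int) -> None:
    """Stage 1 (assumed form): speculative rounds with conflict rollback.

    Every pending vertex picks a color from the same snapshot, which mimics
    threads that cannot see each other's choices within a round.
    """
    for _ in range(max_rounds):
        pending = [v for v in range(len(adj)) if color[v] == UNCOLORED]
        if not pending:
            break
        snapshot = list(color)
        for v in pending:
            color[v] = smallest_free_color(v, adj, snapshot)
        # Conflict detection: on a same-colored edge, the higher-numbered
        # endpoint gives up its color and retries later.
        losers = [v for v in pending
                  if any(u < v and color[u] == color[v] for u in adj[v])]
        for v in losers:
            color[v] = UNCOLORED


def color_centric_stage(adj: List[List[int]], color: List[int]) -> None:
    """Stage 2 (assumed form): loop over colors rather than over vertices.

    For each color c, an uncolored vertex takes c if no neighbor holds c.
    The serial scan here cannot create conflicts.
    """
    c = 0
    while any(x == UNCOLORED for x in color):
        for v in range(len(adj)):
            if color[v] == UNCOLORED and all(color[u] != c for u in adj[v]):
                color[v] = c
        c += 1


def two_stage_coloring(adj: List[List[int]], max_rounds: int = 2) -> List[int]:
    """Run the speculative stage, then finish with the color-centric stage."""
    color = [UNCOLORED] * len(adj)
    speculative_stage(adj, color, max_rounds)
    color_centric_stage(adj, color)
    return color


def is_proper(adj: List[List[int]], color: List[int]) -> bool:
    """Check that every vertex is colored and no edge joins equal colors."""
    return (all(c != UNCOLORED for c in color)
            and all(color[u] != color[v]
                    for v in range(len(adj)) for u in adj[v]))


if __name__ == "__main__":
    # A 5-cycle given as adjacency lists (vertex i is adjacent to i-1 and i+1).
    cycle = [[1, 4], [0, 2], [1, 3], [2, 4], [3, 0]]
    colors = two_stage_coloring(cycle, max_rounds=1)
    print(colors, is_proper(cycle, colors))
```

Capping the speculative rounds at `max_rounds` reflects the motivation stated in the abstract: once conflicts dominate, the remaining vertices are handed to a conflict-free pass instead of being retried indefinitely.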

Updated: 2021-01-01