Accelerating Large-Scale Prioritized Graph Computations by Hotness Balanced Partition
IEEE Transactions on Parallel and Distributed Systems (IF 5.3), Pub Date: 2021-04-01, DOI: 10.1109/tpds.2020.3032709
Shufeng Gong, Yanfeng Zhang, Ge Yu

Prioritized computation has shown promising performance for a large class of graph algorithms. It prioritizes the execution of the vertices that play important roles in determining convergence. For large-scale distributed graph processing, graph partitioning is an important preprocessing step that aims to balance workload and reduce communication costs between workers. However, existing graph partitioning methods are designed for round-robin synchronous distributed frameworks. They balance workload without distinguishing vertex importance and fail to consider the characteristics of priority-based scheduling, which may limit the benefit of prioritized graph computation. In this article, to accelerate prioritized iterative graph computations, we propose Hotness Balanced Partition (HBP). In prioritized graph computation, high-priority vertices are likely to be executed more frequently and to pass more messages, which makes them hot vertices. Based on this observation, we partition the graph by weighting vertices according to their hotness rather than blindly distributing vertices with equal weights, aiming to spread the hot vertices evenly among workers. We further provide two HBP algorithms: a streaming-based algorithm for efficient one-pass processing and a distributed algorithm for distributed processing. Our results show that the proposed partitioning methods outperform the state-of-the-art partitioning methods Fennel, HotGraph, and SNE.
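
The abstract does not give the details of the HBP algorithms, so the following is only a minimal illustrative sketch of the general idea of balancing hotness in a one-pass streaming partitioner. It is not the authors' method. The function name `hotness_balanced_stream`, the `slack` parameter, the scoring rule (a simple greedy heuristic that favors partitions holding already-placed neighbors and penalizes partitions with a high hotness load), and the use of vertex degree as a stand-in for hotness are all assumptions made for illustration.

```python
from collections import defaultdict


def hotness_balanced_stream(stream, hotness, num_parts, slack=1.1):
    """Greedy one-pass partitioner balancing total hotness per partition.

    stream:   iterable of (vertex, neighbors) pairs, each seen once.
    hotness:  dict vertex -> non-negative weight (hypothetical proxy).
    Returns (assignment dict, list of per-partition hotness loads).
    """
    total = sum(hotness.values())
    # Hotness budget per partition; fall back to 1.0 if all weights are zero.
    capacity = (slack * total / num_parts) or 1.0
    load = [0.0] * num_parts
    assignment = {}

    for v, neighbors in stream:
        # Count neighbors of v that were already placed in each partition.
        placed = [0] * num_parts
        for u in neighbors:
            p = assignment.get(u)
            if p is not None:
                placed[p] += 1

        # Prefer partitions with many neighbors (fewer cut edges), scaled
        # down as the partition's hotness load nears its budget.
        # Ties are broken in favor of the lighter partition.
        def score(p):
            return (placed[p] * (1.0 - load[p] / capacity), -load[p])

        best = max(range(num_parts), key=score)
        assignment[v] = best
        load[best] += hotness.get(v, 0.0)

    return assignment, load


# Toy usage: degree is used as a stand-in for hotness (an assumption).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (4, 5)]
adj = defaultdict(list)
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)
hot = {v: float(len(ns)) for v, ns in adj.items()}
assignment, load = hotness_balanced_stream(sorted(adj.items()), hot, num_parts=2)
print(assignment, load)
```

The only difference from an equal-weight streaming partitioner in this sketch is that `load` accumulates hotness rather than vertex counts, so two very hot vertices with no shared neighbors tend to land on different workers.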

Updated: 2021-04-01