A Hybrid Update Strategy for I/O-Efficient Out-of-Core Graph Processing
IEEE Transactions on Parallel and Distributed Systems (IF 5.6), Pub Date: 2020-08-01, DOI: 10.1109/tpds.2020.2973143
Xianghao Xu , Fang Wang , Hong Jiang , Yongli Cheng , Dan Feng , Yongxuan Zhang

In recent years, a number of out-of-core graph processing systems have been proposed to process graphs with billions of edges on a single commodity computer, an approach valued for its high cost efficiency. To improve performance, these systems adopt a full I/O model that scans all edges during computation, avoiding the inefficiency of random I/O. Although this model ensures good I/O access locality, it loads a large number of useless edges when running graph algorithms that access only a small portion of the edges in each iteration. An intuitive solution to this I/O inefficiency is the on-demand I/O model, which accesses only the active edges. However, this method works well only for graph algorithms with very few active edges, because the I/O cost grows rapidly as the number of active edges, and with it the amount of random I/O, increases. In this article, we present HUS-Graph, an efficient out-of-core graph processing system that addresses these I/O issues and achieves a good balance between I/O traffic and I/O access locality. HUS-Graph adopts a hybrid update strategy with two update models, Row-oriented Push (ROP) and Column-oriented Pull (COP), and switches between them adaptively for graph algorithms with different computation and I/O features. For traversal-based algorithms, HUS-Graph also provides an immediate propagation-based vertex update scheme that accelerates vertex state propagation and convergence. Furthermore, HUS-Graph adopts a locality-optimized dual-block representation to organize graph data, and an I/O-based performance prediction method that lets the system dynamically select the better update model between ROP and COP. To save disk space and further reduce I/O traffic, HUS-Graph implements a space-efficient storage format that combines several graph compression methods. Extensive experimental results show that HUS-Graph outperforms two existing out-of-core systems, GraphChi and GridGraph, by 1.2x to 52.8x.
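
The trade-off described in the abstract can be illustrated with a small sketch. It is not HUS-Graph's implementation and does not reproduce the paper's prediction method. It assumes that ROP corresponds to on-demand reads of the out-edges of active vertices and COP to a sequential scan of all edges, which the abstract suggests but does not state. The names `DiskModel`, `choose_model`, and `bfs_hybrid`, the throughput figures, and the cost formula are all hypothetical, chosen only to show how an I/O estimate might drive the switch during a breadth-first traversal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskModel:
    """Hypothetical device parameters (assumptions, not from the paper)."""
    seq_bytes_per_s: float = 150e6  # sequential read throughput
    rand_bytes_per_s: float = 2e6   # effective throughput of small random reads


def choose_model(active_edges, total_edges, edge_bytes=8, disk=DiskModel()):
    """Pick the update model with the lower estimated I/O time.

    Assumption: ROP reads only active edges with random I/O,
    COP reads every edge with sequential I/O.
    """
    on_demand_cost = active_edges * edge_bytes / disk.rand_bytes_per_s
    full_scan_cost = total_edges * edge_bytes / disk.seq_bytes_per_s
    return "ROP" if on_demand_cost < full_scan_cost else "COP"


def bfs_hybrid(out_edges, num_vertices, source, disk=DiskModel()):
    """In-memory BFS that switches between push and pull per iteration.

    Returns (distances, list of models chosen in each iteration).
    """
    # Build the in-edge view that the pull model scans.
    in_edges = {}
    for u, targets in out_edges.items():
        for v in targets:
            in_edges.setdefault(v, []).append(u)
    total_edges = sum(len(t) for t in out_edges.values())

    dist = [None] * num_vertices
    dist[source] = 0
    frontier = {source}
    trace = []
    level = 0
    while frontier:
        active_edges = sum(len(out_edges.get(u, ())) for u in frontier)
        model = choose_model(active_edges, total_edges, disk=disk)
        trace.append(model)
        next_frontier = set()
        if model == "ROP":
            # Push: each active vertex updates its out-neighbours.
            for u in frontier:
                for v in out_edges.get(u, ()):
                    if dist[v] is None:
                        dist[v] = level + 1
                        next_frontier.add(v)
        else:
            # Pull: each unvisited vertex checks its in-neighbours.
            for v in range(num_vertices):
                if dist[v] is None and any(
                    u in frontier for u in in_edges.get(v, ())
                ):
                    dist[v] = level + 1
                    next_frontier.add(v)
        frontier = next_frontier
        level += 1
    return dist, trace


if __name__ == "__main__":
    graph = {0: [1], 1: [2, 3, 4, 5], 2: [6], 3: [6], 4: [6], 5: [6]}
    # A toy device where sequential reads are 4x faster than random reads.
    toy_disk = DiskModel(seq_bytes_per_s=4.0, rand_bytes_per_s=1.0)
    print(bfs_hybrid(graph, 7, 0, disk=toy_disk))
```

With the toy device above, the traversal pushes while the frontier touches few edges and pulls once the frontier's out-edges make up a large share of the graph. Both models compute the same distances. Only the estimated I/O cost differs.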

Updated: 2020-08-01