A New Perspective of Graph Data and A Generic and Efficient Method for Large Scale Graph Data Traversal, arXiv - CS - Distributed, Parallel, and Cluster Computing


arXiv - CS - Distributed, Parallel, and Cluster Computing. Pub Date: 2020-09-16, DOI: arxiv-2009.07463
Chenglong Zhang

Breadth-first search (BFS) is a basic graph data processing algorithm; many other graph processing algorithms share its architectural features and can be built on the BFS model. We analyze in detail how graph algorithms differ from traditional high-performance algorithms, propose a new classification of algorithms into data-independent and data-correlated algorithms according to how their run-time behavior depends on the data, and use this classification to explain the validity of the methods proposed in this paper. Through a deeper analysis of graph data, we propose a new fundamental perspective for understanding it: we establish a link between two basic data structures, graph and tree, and view graph data as consisting of smaller subgraphs and edge trees. Small-degree vertices are found to be one important cause of random memory access. Based on this, we propose a general, easy-to-implement, and efficient method for graph data processing whose basic idea is to treat low-degree vertices and core subgraphs separately, which significantly reduces the amount of random memory access and improves memory access efficiency. Finally, we evaluate the method on three major data center computing platforms (Intel, AMD, and ARM); the experiments show performance improvements of 19.7%, 31.8%, and 17.9%, respectively, with a performance-power ratio of 282.70 MTEPS/s on the ARM platform, which ranked No. 1 in the world on the big dataset list of the Green Graph500 in November 2019.
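The idea of handling low-degree vertices apart from a core subgraph can be illustrated with a small sketch. The abstract does not give the paper's actual procedure, so what follows is an assumption: it interprets "low-degree vertices" as degree-1 vertices that are peeled off repeatedly (so that hanging trees are removed), runs an ordinary BFS on what remains, and then fills in the peeled vertices from the vertex they hung from. The function names `peel_low_degree` and `bfs_levels_with_peeling`, the dictionary-based adjacency format, and the choice to never peel the source are all hypothetical and chosen for illustration only.

```python
from collections import deque


def peel_low_degree(adj, source):
    """Repeatedly strip degree-1 vertices from a simple undirected graph.

    Assumption (not from the paper): "low degree" means degree 1, and the
    BFS source is never peeled so it always stays in the core.
    Returns (core_vertices, peeled), where peeled is a list of
    (vertex, anchor) pairs in removal order.
    """
    degree = {v: len(neighbors) for v, neighbors in adj.items()}
    removed = set()
    peeled = []
    queue = deque(v for v, d in degree.items() if d == 1 and v != source)
    while queue:
        v = queue.popleft()
        if v in removed or degree[v] != 1:
            continue  # its only neighbor was peeled first
        # The single neighbor that is still present.
        anchor = next(u for u in adj[v] if u not in removed)
        removed.add(v)
        peeled.append((v, anchor))
        degree[anchor] -= 1
        if degree[anchor] == 1 and anchor != source:
            queue.append(anchor)
    core = set(adj) - removed
    return core, peeled


def bfs_levels_with_peeling(adj, source):
    """BFS levels from source: traverse the core, then attach peeled trees."""
    core, peeled = peel_low_degree(adj, source)
    # Compact adjacency restricted to the core, so the traversal
    # never touches peeled vertices.
    core_adj = {v: [u for u in adj[v] if u in core] for v in core}

    level = {source: 0}
    frontier = deque([source])
    while frontier:
        v = frontier.popleft()
        for u in core_adj[v]:
            if u not in level:
                level[u] = level[v] + 1
                frontier.append(u)

    # A peeled vertex is reachable only through its anchor. Walking the
    # removal order backwards guarantees the anchor is resolved first.
    for v, anchor in reversed(peeled):
        if anchor in level:
            level[v] = level[anchor] + 1
    return level


if __name__ == "__main__":
    # Triangle 0-1-2 with a tail 2-3-4 and a leaf 5 attached to 0.
    graph = {0: [1, 2, 5], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3], 5: [0]}
    print(bfs_levels_with_peeling(graph, 0))
```

In this toy graph the queue-based traversal only visits the triangle; vertices 3, 4, and 5 receive their levels in a single sequential pass over the peel list. Peeling once per source, as done here, keeps the sketch short; a practical implementation would presumably precompute the decomposition once and reuse it across searches.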


Updated: 2020-09-18
