Parallelizing approximate single-source personalized PageRank queries on shared memory, The VLDB Journal

The VLDB Journal (IF 2.8), Pub Date: 2019-10-08, DOI: 10.1007/s00778-019-00576-7
Runhui Wang, Sibo Wang, Xiaofang Zhou

Given a directed graph G, a source node s, and a target node t, the personalized PageRank (PPR) \(\pi (s,t)\) measures the importance of node t with respect to node s. In this work, we study the single-source PPR query, which takes a source node s as input and outputs the PPR values of all nodes in G with respect to s. The single-source PPR query has many important applications, e.g., community detection and recommendation. Deriving exact answers for single-source PPR queries is prohibitively expensive, so most existing work focuses on approximate solutions. Nevertheless, existing approximate solutions are still inefficient, and computing single-source PPR queries efficiently for online applications remains challenging. This motivates us to devise efficient parallel algorithms for shared-memory multi-core systems. We show how to efficiently parallelize the state-of-the-art index-based solution FORA and analyze the complexity of the resulting parallel algorithms. We prove that our proposed algorithm achieves a time complexity of \(O(W/P+\log ^2{n})\), where W is the time complexity of the sequential FORA algorithm, P is the number of processors used, and n is the number of nodes in the graph. FORA consists of a forward push phase and a random walk phase, and we present optimization techniques for both phases, including effective maintenance of active nodes, more efficient memory access, and cache-aware scheduling. Extensive experimental evaluation demonstrates that our solution achieves a speedup of up to 37\(\times \) on 40 cores and runs 3.3\(\times \) faster than alternatives on 40 cores. Moreover, forward push alone can be used for local graph clustering, and our parallel forward push algorithm is 4.8\(\times \) faster than existing parallel alternatives.
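To make the two phases named in the abstract concrete, the following is a minimal sequential sketch of a forward push followed by random walks from the leftover residues. It is not the authors' parallel algorithm and it omits FORA's index. The function names, the adjacency-list format, the fixed walk budget `omega`, and the handling of dangling nodes are all assumptions made for illustration.

```python
import math
import random
from collections import deque


def forward_push(graph, source, alpha, r_max):
    """Push residue from `source` until no node v has r(v) > r_max * d_out(v)."""
    reserve = {v: 0.0 for v in graph}
    residue = {v: 0.0 for v in graph}
    residue[source] = 1.0
    queue = deque([source])   # active nodes waiting for a push
    queued = {source}
    while queue:
        v = queue.popleft()
        queued.discard(v)
        out = graph[v]
        r = residue[v]
        residue[v] = 0.0
        if not out:
            # Assumption: a walk stops at a dangling node, so it keeps all mass.
            reserve[v] += r
            continue
        reserve[v] += alpha * r                  # mass that stops at v
        share = (1.0 - alpha) * r / len(out)     # mass sent along each out-edge
        for u in out:
            residue[u] += share
            threshold = r_max * max(len(graph[u]), 1)
            if u not in queued and residue[u] > threshold:
                queue.append(u)
                queued.add(u)
    return reserve, residue


def random_walk(graph, start, alpha, rng):
    """Walk from `start`; stop with probability alpha per step or at a dangling node."""
    v = start
    while True:
        if rng.random() < alpha or not graph[v]:
            return v
        v = rng.choice(graph[v])


def fora_sketch(graph, source, alpha=0.2, r_max=1e-2, omega=1000, seed=0):
    """Combine push reserves with walk-based estimates of the leftover residue."""
    rng = random.Random(seed)
    reserve, residue = forward_push(graph, source, alpha, r_max)
    estimate = dict(reserve)
    for v, r in residue.items():
        if r <= 0.0:
            continue
        walks = math.ceil(r * omega)   # hypothetical budget: walks proportional to residue
        for _ in range(walks):
            estimate[random_walk(graph, v, alpha, rng)] += r / walks
    return estimate


if __name__ == "__main__":
    g = {0: [1, 2], 1: [2], 2: [0], 3: [0]}   # adjacency lists of out-neighbors
    print(fora_sketch(g, source=0))
```

In this sketch the walks launched from different nodes do not depend on one another, while each push reads and writes residues shared with neighboring nodes; the maintenance of active nodes that the abstract mentions corresponds here to the `queue` and `queued` bookkeeping.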

Updated: 2019-10-08