Efficient structural node similarity computation on billion-scale graphs
The VLDB Journal (IF 2.8), Pub Date: 2021-02-23, DOI: 10.1007/s00778-021-00654-9
Xiaoshuang Chen, Longbin Lai, Lu Qin, Xuemin Lin

Structural node similarity is widely used in analyzing complex networks. As one of the structural node similarity metrics, role similarity has the merit of indicating automorphism (isomorphism). Existing algorithms to compute role similarity (e.g., RoleSim and NED) suffer from severe performance bottlenecks and thus cannot handle large real-world graphs. In this paper, we propose a new framework, namely StructSim, to compute nodes' role similarity. Under this framework, we first prove that StructSim is an admissible role similarity metric based on the maximum matching. Because the maximum matching is still too costly to scale, we then devise the BinCount matching, which is not only efficient to compute but also guarantees the admissibility of StructSim. BinCount-based StructSim admits a precomputed index to query a single pair of nodes in \(O(k\log D)\) time, where \(k\) is a small user-defined parameter and \(D\) is the maximum node degree. To build the index, we further devise an FM-sketch-based technique that can handle graphs with billions of edges. Extensive empirical studies show that StructSim outperforms existing work in both effectiveness and efficiency when applied to compute structural node similarities on real-world graphs.
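
The abstract does not describe how the FM-sketch-based index is built, so the following is only a minimal sketch of a generic Flajolet-Martin (FM) sketch for approximate distinct counting, not the paper's index construction. The class name `FMSketch`, the parameter `num_maps`, and the choice of BLAKE2b as the hash function are assumptions made for illustration. The property that makes such sketches attractive on large graphs is that two sketches merge with a bitwise OR, so the sketch of a union of sets can be obtained without revisiting the underlying items.

```python
import hashlib

PHI = 0.77351  # Flajolet-Martin bias-correction constant


def _hash(item, seed):
    """Deterministic 64-bit hash of `item`, salted by `seed` (illustrative choice)."""
    data = f"{seed}:{item}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def _trailing_zeros(x, width=64):
    """Index of the least significant 1-bit of x (width - 1 if x is 0)."""
    if x == 0:
        return width - 1
    return (x & -x).bit_length() - 1


class FMSketch:
    """Generic FM sketch: `num_maps` bitmaps, one per salted hash function."""

    def __init__(self, num_maps=64):
        self.bitmaps = [0] * num_maps

    def add(self, item):
        # Each hash function sets one bit, chosen by the trailing-zero count.
        for seed in range(len(self.bitmaps)):
            self.bitmaps[seed] |= 1 << _trailing_zeros(_hash(item, seed))

    def merge(self, other):
        # The sketch of a set union is the bitwise OR of the two sketches.
        merged = FMSketch(len(self.bitmaps))
        merged.bitmaps = [a | b for a, b in zip(self.bitmaps, other.bitmaps)]
        return merged

    def estimate(self):
        # Average the position of the lowest unset bit over all bitmaps.
        total = 0
        for bitmap in self.bitmaps:
            r = 0
            while bitmap & (1 << r):
                r += 1
            total += r
        return 2 ** (total / len(self.bitmaps)) / PHI


if __name__ == "__main__":
    a, b = FMSketch(), FMSketch()
    for i in range(6000):
        a.add(i)
    for i in range(4000, 10000):
        b.add(i)
    # The two ranges overlap; the merged sketch approximates |union| = 10000.
    print(round(a.merge(b).estimate()))
```

Adding an item twice leaves the sketch unchanged, and the estimate is approximate: with 64 bitmaps the relative error is typically around ten percent, and it shrinks as `num_maps` grows.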




Updated: 2021-02-23