Towards efficient solutions of bitruss decomposition for large-scale bipartite graphs
The VLDB Journal (IF 4.2), Pub Date: 2021-03-20, DOI: 10.1007/s00778-021-00658-5
Kai Wang, Xuemin Lin, Lu Qin, Wenjie Zhang, Ying Zhang

In recent years, cohesive subgraph mining in bipartite graphs has become a popular research topic. An important cohesive subgraph model, the k-bitruss, is the maximal cohesive subgraph in which each edge is contained in at least k butterflies (i.e., (2, 2)-bicliques). In this paper, we study the bitruss decomposition problem, which aims to find all the k-bitrusses for \(k \ge 0\). The existing algorithms follow a bottom-up strategy that iteratively peels the edges with the lowest butterfly support. In this peeling process, enumerating all the supporting butterflies for each edge is time-consuming. To solve this issue, we propose a novel online index, the \(\mathsf {BE}\)-\(\mathsf {Index}\), which compresses butterflies into k-blooms (i.e., (2, k)-bicliques). Based on the \(\mathsf {BE}\)-\(\mathsf {Index}\), we propose a new bitruss decomposition algorithm, \(\mathsf {BiT}\)-\(\mathsf {BU}\), along with two batch-based optimizations, to carry out the butterfly enumeration of the peeling process efficiently. Furthermore, we design the \(\mathsf {BiT}\)-\(\mathsf {PC}\) algorithm, which handles edges with high butterfly supports more efficiently. We also explore shared-memory parallel solutions to handle large graphs more efficiently, and in the parallel algorithms we propose effective techniques to reduce conflicts among threads. We show theoretically that our new algorithms significantly reduce the time complexities of the existing algorithms. In addition, we conduct extensive empirical evaluations on real-world datasets. The experimental results further validate the effectiveness of the bitruss model and demonstrate that our proposed solutions outperform the state-of-the-art techniques by several orders of magnitude.
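
To make the definitions above concrete, the following is a minimal sketch of the baseline bottom-up peeling strategy that the abstract attributes to existing algorithms. It is not the \(\mathsf {BE}\)-\(\mathsf {Index}\), \(\mathsf {BiT}\)-\(\mathsf {BU}\), or \(\mathsf {BiT}\)-\(\mathsf {PC}\) method; the function name `bitruss_decomposition`, the helper `butterflies`, and the edge-list input format are assumptions made purely for illustration. The sketch re-enumerates the supporting butterflies of every peeled edge, which is the cost the abstract identifies as the bottleneck.

```python
import heapq
from collections import defaultdict


def bitruss_decomposition(edges):
    """Naive bottom-up peeling (illustrative baseline, not the paper's index).

    `edges` is an iterable of (u, v) pairs, u from one side of the bipartite
    graph and v from the other. Returns {edge: bitruss number}, i.e. the
    largest k such that the edge belongs to the k-bitruss.
    """
    edge_set = set(edges)
    adj_u = defaultdict(set)  # upper-side vertex -> its lower-side neighbours
    adj_v = defaultdict(set)  # lower-side vertex -> its upper-side neighbours
    for u, v in edge_set:
        adj_u[u].add(v)
        adj_v[v].add(u)

    def butterflies(u, v):
        """Yield (w, x) such that {u, w} x {v, x} is a butterfly on live edges."""
        for w in adj_v[v]:
            if w == u:
                continue
            for x in adj_u[u] & adj_u[w]:
                if x != v:
                    yield w, x

    # Butterfly support: number of butterflies that contain each edge.
    support = {e: sum(1 for _ in butterflies(*e)) for e in edge_set}
    heap = [(s, e) for e, s in support.items()]
    heapq.heapify(heap)

    bitruss = {}
    k = 0
    while heap:
        s, e = heapq.heappop(heap)
        if e in bitruss or s != support[e]:
            continue  # stale heap entry; a fresher one exists or edge is peeled
        k = max(k, s)  # the peeling level never decreases
        bitruss[e] = k
        u, v = e
        # Each butterfly through e disappears, so its other three edges lose one.
        for w, x in list(butterflies(u, v)):
            for other in ((u, x), (w, v), (w, x)):
                support[other] -= 1
                heapq.heappush(heap, (support[other], other))
        adj_u[u].discard(v)
        adj_v[v].discard(u)
    return bitruss
```

As a usage example, calling the function on a single butterfly plus one pendant edge, `[("a", "x"), ("a", "y"), ("b", "x"), ("b", "y"), ("c", "x")]`, assigns the pendant edge `("c", "x")` bitruss number 0 and each of the four butterfly edges bitruss number 1.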




Updated: 2021-03-21