An Efficient Graph Mining System for Large Patterns
arXiv - CS - Performance. Pub Date: 2021-01-19, DOI: arxiv-2101.07690
Peng Jiang, Rujia Wang, Bo Wu

There has been growing interest in recent years in designing systems for graph pattern mining. Existing systems mostly focus on small patterns and have difficulty mining larger ones. In this work, we propose Angelica, a single-machine graph pattern mining system that aims to support large patterns. We first propose a new computation model called multi-vertex exploration, which allows us to divide a large pattern mining task into smaller matching tasks. Unlike existing systems, which perform vertex-by-vertex exploration, we explore larger subgraphs by joining small subgraphs. Building on the new computation model, we further improve performance with two techniques: an index-based quick pattern technique that addresses the cost of isomorphism checks, and an approximate join that mitigates the subgraph explosion of large patterns. The experimental results show that Angelica achieves significant speedups over state-of-the-art graph pattern mining systems and supports large pattern mining that none of the existing systems can handle.
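
The idea of exploring larger subgraphs by joining small ones can be illustrated with a minimal Python sketch. This is not Angelica's implementation, which the abstract does not describe in detail. The function names `edge_subgraphs` and `join`, the adjacency-dict graph representation, and the choice to identify a subgraph by its vertex set are all assumptions made for illustration, and the sketch leaves out pattern matching, isomorphism checks, and approximate join.

```python
def edge_subgraphs(adj):
    """Return every 2-vertex connected subgraph (edge) as a frozenset of vertices.

    `adj` is a hypothetical adjacency dict: vertex -> set of neighbours.
    """
    return {frozenset((u, v)) for u, nbrs in adj.items() for v in nbrs}


def join(left, right, shared=1):
    """Join two collections of vertex sets that overlap in exactly `shared` vertices.

    Both inputs hold vertex sets of connected subgraphs. Two connected
    subgraphs that share at least one vertex have a connected union, so
    `shared` must be >= 1 (illustrative assumption, not Angelica's join rule).
    """
    # Index the right side by vertex so only overlapping candidates are compared.
    index = {}
    for s in right:
        for v in s:
            index.setdefault(v, set()).add(s)

    out = set()
    for a in left:
        candidates = set()
        for v in a:
            candidates |= index.get(v, set())
        for b in candidates:
            if len(a & b) == shared:
                out.add(a | b)
    return out


# A path graph 0 - 1 - 2 - 3.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}

edges = edge_subgraphs(adj)   # 2-vertex subgraphs
three = join(edges, edges)    # 3-vertex subgraphs from two edges
four = join(three, edges)     # 4-vertex subgraphs from a 3-vertex subgraph and an edge
```

On this path graph, `three` holds the vertex sets {0, 1, 2} and {1, 2, 3}, and `four` holds the single set {0, 1, 2, 3}. Each join step grows a subgraph by more than one candidate at a time, in contrast to extending one vertex per step.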

Updated: 2021-01-20