An Efficient Graph Mining System for Large Patterns
arXiv - CS - Performance. Pub Date: 2021-01-19. DOI: arxiv-2101.07690. Authors: Peng Jiang, Rujia Wang, Bo Wu
Interest in designing systems for graph pattern mining has grown in recent
years. Existing systems mostly focus on small patterns and have
difficulty mining larger ones. In this work, we propose Angelica, a
single-machine graph pattern mining system that aims to support large patterns.
We first propose a new computation model called multi-vertex exploration. The
model allows us to divide a large pattern mining task into smaller matching
tasks. Unlike existing systems, which perform vertex-by-vertex
exploration, we explore larger subgraphs by joining small subgraphs. Based on
the new computation model, we further improve performance through an
index-based quick pattern technique that addresses the cost of expensive
isomorphism checks, and an approximate join that mitigates the subgraph
explosion caused by large patterns. The experimental results show that Angelica
achieves significant speedups over state-of-the-art graph pattern mining
systems and supports large pattern mining that none of the existing systems can
handle.
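
The idea of growing subgraphs by joining smaller ones, rather than adding one vertex at a time, can be illustrated with a toy sketch. This is not Angelica's implementation: the function names `edge_subgraphs` and `join`, the adjacency-dict graph representation, and the example graph are all assumptions made for illustration, and the sketch ignores pattern matching, the index-based quick pattern technique, and approximate join. It only shows that two overlapping connected subgraphs can be merged into a larger connected subgraph in a single step.

```python
def edge_subgraphs(adj):
    """Return every edge of an undirected graph as a frozenset of two vertices."""
    return {frozenset((u, v)) for u, nbrs in adj.items() for v in nbrs if u != v}


def join(left, right, size):
    """Join two collections of connected vertex sets.

    Two sets are joined when they share at least one vertex; the union of
    two overlapping connected subgraphs is itself connected. Only unions
    with exactly `size` vertices are kept, and duplicates collapse
    automatically because the result is a set of frozensets.
    """
    out = set()
    for a in left:
        for b in right:
            if a & b:  # overlap guarantees the union is connected
                merged = a | b
                if len(merged) == size:
                    out.add(merged)
    return out


# Hypothetical example graph: a triangle 0-1-2 with a pendant vertex 3 on 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}

edges = edge_subgraphs(adj)      # connected 2-vertex subgraphs
three = join(edges, edges, 3)    # connected 3-vertex subgraphs
four = join(three, three, 4)     # two 3-vertex subgraphs joined in one step

print(sorted(sorted(s) for s in three))
print(sorted(sorted(s) for s in four))
```

Joining `three` with itself reaches 4-vertex subgraphs without an intermediate single-vertex extension. A vertex-by-vertex explorer would instead extend each 3-vertex subgraph by one neighbor at a time. The nested loop here is quadratic in the number of small subgraphs, which hints at why a real system needs indexing and some way to control subgraph explosion.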
Updated: 2021-01-20