Discovering Closed and Maximal Embedded Patterns from Large Tree Data
arXiv - CS - Databases, Pub Date: 2020-12-26, DOI: arxiv-2012.13685
Xiaoying Wu, Dimitri Theodoratos, Nikos Mamoulis

We address the problem of summarizing embedded tree patterns extracted from large data trees. We do so by defining and mining closed and maximal embedded unordered tree patterns from a single large data tree. We design an embedded frequent pattern mining algorithm extended with a local closedness checking technique. This algorithm is called closedEmbTM-eager as it eagerly eliminates non-closed patterns. To mitigate the generation of intermediate patterns, we devise pattern search space pruning rules to proactively detect and prune branches in the pattern search space which do not correspond to closed patterns. The pruning rules are incorporated into the extended embedded pattern miner to produce a new algorithm, called closedEmbTM-prune, for mining all the closed and maximal embedded frequent patterns from large data trees. Our extensive experiments on synthetic and real large-tree datasets demonstrate that, on dense datasets, closedEmbTM-prune not only generates a complete closed and maximal pattern set which is substantially smaller than that generated by the embedded pattern miner, but also runs much faster with negligible overhead on pattern pruning.
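To make the notions of closed and maximal patterns concrete, here is a minimal Python sketch. It is not the authors' algorithm, and it does not handle tree patterns: as a simplifying assumption, plain itemsets with subset containment stand in for embedded tree patterns and their inclusion relation. The function name `closed_and_maximal` and the toy supports are hypothetical. A frequent pattern is treated as closed when no proper super-pattern has the same support, and as maximal when it has no frequent proper super-pattern.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Assumption: itemsets stand in for tree patterns; "p < q" (proper subset)
# stands in for "p is a proper sub-pattern of q".
Pattern = FrozenSet[str]


def closed_and_maximal(
    supports: Dict[Pattern, int],
) -> Tuple[Set[Pattern], Set[Pattern]]:
    """Post-filter a set of frequent patterns (with supports) into the
    closed subset and the maximal subset."""
    closed: Set[Pattern] = set()
    maximal: Set[Pattern] = set()
    for p, s in supports.items():
        # All frequent proper super-patterns of p.
        supers = [q for q in supports if p < q]
        if not supers:
            maximal.add(p)
        # Closed: no proper super-pattern shares p's support.
        if all(supports[q] != s for q in supers):
            closed.add(p)
    return closed, maximal


# Hypothetical toy input: every frequent pattern mapped to its support.
frequent = {
    frozenset("a"): 5,
    frozenset("b"): 4,
    frozenset("ab"): 4,
    frozenset("abc"): 2,
}
closed, maximal = closed_and_maximal(frequent)
```

In this toy input, `{b}` is not closed because `{a, b}` has the same support, and only `{a, b, c}` is maximal. Every maximal pattern is also closed, which is why both sets are smaller summaries of the full frequent set. A post-filter like this still enumerates every frequent pattern first; the abstract's pruning rules aim to avoid that cost by cutting non-closed branches during the search.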

Updated: 2020-12-29