Discovering Closed and Maximal Embedded Patterns from Large Tree Data
arXiv - CS - Databases. Pub Date: 2020-12-26. DOI: arxiv-2012.13685. Authors: Xiaoying Wu, Dimitri Theodoratos, Nikos Mamoulis
We address the problem of summarizing embedded tree patterns extracted from
large data trees. We do so by defining and mining closed and maximal embedded
unordered tree patterns from a single large data tree. We design an embedded
frequent pattern mining algorithm extended with a local closedness checking
technique. This algorithm is called closedEmbTM-eager as it eagerly
eliminates non-closed patterns. To reduce the number of intermediate
patterns generated, we devise pruning rules that proactively detect
and prune branches of the pattern search space that do not correspond to
closed patterns. Incorporating these rules into the extended embedded
pattern miner yields a new algorithm, called closedEmbTM-prune, for
mining all the closed and maximal embedded frequent patterns from large data
trees. Our extensive experiments on synthetic and real large-tree datasets
demonstrate that, on dense datasets, closedEmbTM-prune not only generates
a complete closed and maximal pattern set that is substantially smaller than
the set generated by the embedded pattern miner, but also runs much faster, with
negligible pattern-pruning overhead.
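
To make the notions of closed and maximal patterns concrete, here is a minimal sketch that uses the standard frequent-pattern-mining definitions, which the paper may refine for embedded tree patterns. A frequent pattern is closed if no proper superpattern has the same support, and maximal if it has no frequent proper superpattern. The function name closed_and_maximal, the is_proper_subpattern predicate, and the toy data are assumptions made for illustration. Label sets stand in for tree patterns, so this is neither an embedded tree matcher nor the authors' algorithm, and it filters an already mined pattern set instead of pruning the search space.

```python
from typing import Callable, Dict, Hashable, Set, Tuple, TypeVar

P = TypeVar("P", bound=Hashable)


def closed_and_maximal(
    support: Dict[P, int],
    is_proper_subpattern: Callable[[P, P], bool],
) -> Tuple[Set[P], Set[P]]:
    """Post-filter a set of frequent patterns (hypothetical helper).

    support maps every frequent pattern to its support count.
    is_proper_subpattern(p, q) must return True when p is a proper
    subpattern of q; for trees this would be an embedded-inclusion test.
    """
    closed: Set[P] = set()
    maximal: Set[P] = set()
    for p, sup in support.items():
        # All frequent proper superpatterns of p.
        supers = [q for q in support if is_proper_subpattern(p, q)]
        # Closed: no proper superpattern has the same support.
        if all(support[q] != sup for q in supers):
            closed.add(p)
        # Maximal: no frequent proper superpattern at all.
        if not supers:
            maximal.add(p)
    return closed, maximal


# Toy stand-in data: sets of labels instead of tree patterns (an assumption).
toy_support = {
    frozenset("a"): 5,
    frozenset("b"): 4,
    frozenset("ab"): 4,
    frozenset("abc"): 2,
}

# For frozensets, `p < q` is the proper-subset test.
closed, maximal = closed_and_maximal(toy_support, lambda p, q: p < q)
print(sorted("".join(sorted(p)) for p in closed))   # ['a', 'ab', 'abc']
print(sorted("".join(sorted(p)) for p in maximal))  # ['abc']
```

In the toy data, the pattern {b} is not closed because its superpattern {a, b} has the same support, and every maximal pattern is also closed. This quadratic post-filter only illustrates the definitions; the abstract describes checking closedness locally and pruning during mining.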
Updated: 2020-12-29