A MapReduce approach for spatial co-location pattern mining via ordered-clique-growth
Distributed and Parallel Databases (IF 1.5), Pub Date: 2019-12-02, DOI: 10.1007/s10619-019-07278-7
Peizhong Yang, Lizhen Wang, Xiaoxuan Wang

A spatial co-location pattern is a subset of spatial features whose instances are frequently located together in geographic space. Mining co-location patterns is particularly valuable for discovering spatial dependencies. Traditional co-location pattern mining algorithms become computationally expensive as data volume grows rapidly. In this paper, we explore a novel iterative framework for co-location pattern mining based on parallel ordered-clique-growth. The ordered clique extension re-uses previously processed information and can be executed in parallel, which speeds up the identification of co-location instances. Based on this iterative framework, we design a MapReduce algorithm, PCPM_OC, that searches for prevalent co-location patterns in a level-wise manner. To narrow the search space of ordered cliques, two pruning techniques are proposed to filter out as many invalid clique instances as possible. We prove the completeness and correctness of PCPM_OC and discuss its complexity. We also compare PCPM_OC with two advanced MapReduce-based co-location pattern mining algorithms from multiple perspectives. Finally, extensive experiments on synthetic and real-world spatial datasets are conducted to study the performance of PCPM_OC. The experimental results show that PCPM_OC significantly improves efficiency and scales better on massive spatial data.
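
The abstract does not give the details of PCPM_OC, so the following is only a minimal single-machine sketch of the general idea of level-wise ordered clique growth, not the authors' MapReduce algorithm or their pruning techniques. It assumes a distance-threshold neighbor relation and uses the participation index, a prevalence measure that is common in the co-location literature but is not named in the abstract. The function names `later_neighbors`, `grow_cliques`, and `participation_index`, the `radius` parameter, and the toy data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import dist


def later_neighbors(instances, radius):
    """Map each instance to its neighbors with a strictly larger feature.

    instances: dict of instance id -> (feature, (x, y)).
    Assumption: two instances are neighbors when their features differ
    and their Euclidean distance is at most `radius`.
    """
    order = sorted(instances, key=lambda i: (instances[i][0], i))
    later = defaultdict(set)
    for a, b in combinations(order, 2):
        (fa, pa), (fb, pb) = instances[a], instances[b]
        if fa != fb and dist(pa, pb) <= radius:
            later[a].add(b)  # a precedes b in feature order
    return later


def grow_cliques(cliques, later):
    """Extend every ordered k-clique to ordered (k+1)-cliques.

    A new member must be a later neighbor of every current member, so the
    cliques found at the previous level are reused instead of recomputed.
    Each clique is processed independently of the others, which is what
    makes this step a candidate for parallel execution.
    """
    grown = []
    for clique in cliques:
        candidates = set.intersection(*(later.get(m, set()) for m in clique))
        grown.extend(clique + (n,) for n in sorted(candidates))
    return grown


def participation_index(cliques, instances):
    """Return pattern -> participation index for the given clique instances.

    For each feature in a pattern, take the fraction of that feature's
    instances appearing in some clique of the pattern; the index is the
    minimum of these fractions.
    """
    totals = defaultdict(int)
    for feature, _ in instances.values():
        totals[feature] += 1
    seen = defaultdict(lambda: defaultdict(set))
    for clique in cliques:
        pattern = tuple(instances[i][0] for i in clique)
        for i in clique:
            seen[pattern][instances[i][0]].add(i)
    return {
        pattern: min(len(ids) / totals[f] for f, ids in by_feature.items())
        for pattern, by_feature in seen.items()
    }


# Toy data (hypothetical): three features A, B, C.
instances = {
    "A1": ("A", (0.0, 0.0)), "A2": ("A", (10.0, 10.0)),
    "B1": ("B", (0.0, 1.0)), "B2": ("B", (10.0, 11.0)),
    "C1": ("C", (1.0, 0.0)),
}
later = later_neighbors(instances, radius=1.5)
level = grow_cliques([(i,) for i in sorted(instances)], later)
while level:
    print(participation_index(level, instances))
    level = grow_cliques(level, later)
```

A pattern would then be reported as prevalent when its index meets a user-chosen threshold. In a MapReduce setting, one plausible arrangement is to partition the cliques of each level across mappers, but how PCPM_OC actually distributes the work is not stated in the abstract.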

Updated: 2019-12-02