Delaunay triangulation‐based spatial colocation pattern mining without distance thresholds
Statistical Analysis and Data Mining (IF 2.1), Pub Date: 2020-04-07, DOI: 10.1002/sam.11457
Vanha Tran, Lizhen Wang

A spatial colocation pattern is a group of spatial features whose instances frequently appear together in close proximity to each other. The proximity of instances is generally measured by the distance between them: two instances have a neighbor relationship if the distance between them is smaller than a user-specified distance threshold. However, it is difficult for users to choose a suitable distance threshold, and mining results vary widely with different thresholds. In addition, a distance threshold makes it hard to accurately capture the neighborhoods of instances in data sets with heterogeneous distribution density. In this study, we propose a new method, based on Delaunay triangulation (DT), for determining the neighbor relationship of instances in space without a distance threshold. We design three filtering strategies (feature invalid edge, global positive edge, and local positive edge) that constrain the original DT so that it accurately extracts the neighborhoods of instances in space. We then develop a miner called DT‐based colocation (DTC) pattern mining. Unlike traditional algorithms, which adopt the time‐consuming generate‐test candidate model, DTC collects the table instances of colocation patterns directly from the constrained DT by building neighboring polygons, and then filters the prevalent patterns. We compare the results mined by DTC with those of traditional algorithms at the macrolevel and microlevel on both real and synthetic data sets to show that the DTC algorithm improves the effectiveness and fineness of the mining results.
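
As a rough illustration of the idea in the abstract, the sketch below derives neighbor relationships from a Delaunay triangulation instead of a user-specified distance threshold. It is not the DTC algorithm. The abstract names the three edge filters but does not define them, so the two filters used here (dropping edges between instances of the same feature, and dropping edges longer than the mean plus one standard deviation of the remaining edge lengths) are stand-in assumptions. The sample data, the brute-force triangulation, and the helper names `circumcircle`, `delaunay_edges`, and `threshold_neighbors` are also hypothetical.

```python
from itertools import combinations
from math import dist
from statistics import mean, stdev


def circumcircle(a, b, c):
    """Return (center, squared radius) of the circle through a, b, c, or None if collinear."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), (ux - ax) ** 2 + (uy - ay) ** 2


def delaunay_edges(points):
    """Brute-force Delaunay edges: keep every triangle whose circumcircle is empty.

    Only suitable for tiny inputs in general position (no four cocircular points).
    """
    edges = set()
    for i, j, k in combinations(range(len(points)), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        center, r2 = circle
        others = (p for m, p in enumerate(points) if m not in (i, j, k))
        if all(dist(p, center) ** 2 >= r2 - 1e-9 for p in others):
            edges.update({(i, j), (i, k), (j, k)})
    return edges


# Hypothetical instances: a dense cluster near the origin and a sparse one farther away.
instances = [
    ("A", (0.0, 0.0)), ("B", (1.0, 0.2)), ("C", (0.4, 0.9)),
    ("A", (10.0, 0.0)), ("B", (13.0, 0.5)), ("C", (11.0, 3.0)),
]
features = [f for f, _ in instances]
points = [p for _, p in instances]

dt = delaunay_edges(points)

# Stand-in filter 1 (assumption): an edge between two instances of the same feature
# cannot contribute to a colocation, so drop it.
cross = {e for e in dt if features[e[0]] != features[e[1]]}

# Stand-in filter 2 (assumption): drop edges that are long relative to the data itself.
length = {e: dist(points[e[0]], points[e[1]]) for e in cross}
lengths = list(length.values())
cutoff = mean(lengths) + stdev(lengths)
neighbors = {e for e in cross if length[e] <= cutoff}


def threshold_neighbors(d):
    """Conventional approach: cross-feature pairs closer than a fixed threshold d."""
    return {
        (i, j)
        for i, j in combinations(range(len(points)), 2)
        if features[i] != features[j] and dist(points[i], points[j]) < d
    }


print(sorted(neighbors))
print(sorted(threshold_neighbors(1.5)))
```

On this toy data, the filtered triangulation keeps the three cross-feature edges inside each cluster, whereas a fixed threshold of 1.5, which suits the dense cluster, finds no neighbors in the sparse one. A real implementation would use an efficient triangulation routine rather than the brute-force empty-circumcircle test shown here.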

Updated: 2020-04-07