A performance evaluation framework for association mining in spatial data
Journal of Intelligent Information Systems (IF 2.3), Pub Date: 2009-12-16, DOI: 10.1007/s10844-009-0115-6
Qiang Wang, Vasileios Megalooikonomou

Evaluating the process of mining associations is an important and challenging problem in database systems, especially those that store critical data and are used for making critical decisions. Within the context of spatial databases, we present an evaluation framework in which we use probability distributions to model spatial regions, and Bayesian networks to model the joint probability distribution and the structural relationships among spatial and non-spatial predicates. We demonstrate the applicability of the proposed framework by evaluating representatives of two well-known approaches to learning associations: dependency analysis (using statistical tests of independence) and Bayesian methods. By controlling the parameters of the framework, we provide extensive comparative results on the performance of the two approaches. We obtain measures of recovery of known associations as a function of the number of samples used; the strength, number, and type of associations in the model; the number of spatial predicates associated with a particular non-spatial predicate; the prior probabilities of the spatial predicates; the conditional probabilities of the non-spatial predicates; the image registration error; and the parameters that control the sensitivity of the methods. In addition to performance, we investigate the processing efficiency of the two approaches.
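
To make the evaluation idea concrete, the following is a minimal sketch and not the authors' framework. It assumes a toy two-layer Bayesian network in which binary spatial predicates are independent parents of a single binary non-spatial predicate, samples from it, and then checks how many of the known associations a chi-square test of independence recovers. All names (`sample_model`, `chi_square_2x2`, `recovered_associations`, `recovery_rate`) and all parameter values are hypothetical, and the sketch omits the modeling of spatial regions, the image registration error, and the Bayesian learning approach mentioned in the abstract.

```python
import random

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRIT_DF1_ALPHA05 = 3.841


def sample_model(n, priors, associated, p_high, p_low, rng):
    """Draw n samples from a toy two-layer Bayesian network (an assumption).

    Each spatial predicate S_i is an independent Bernoulli(priors[i]) parent.
    The non-spatial predicate F is true with probability p_high when any
    associated spatial predicate is true, and with probability p_low otherwise.
    """
    rows = []
    for _ in range(n):
        s = [rng.random() < p for p in priors]
        active = any(s[i] for i in associated)
        f = rng.random() < (p_high if active else p_low)
        rows.append((s, f))
    return rows


def chi_square_2x2(rows, i):
    """Pearson chi-square statistic for independence of S_i and F."""
    counts = [[0, 0], [0, 0]]
    for s, f in rows:
        counts[int(s[i])][int(f)] += 1
    n = len(rows)
    row_tot = [sum(counts[0]), sum(counts[1])]
    col_tot = [counts[0][0] + counts[1][0], counts[0][1] + counts[1][1]]
    stat = 0.0
    for a in range(2):
        for b in range(2):
            expected = row_tot[a] * col_tot[b] / n
            if expected > 0:  # skip empty margins to avoid division by zero
                stat += (counts[a][b] - expected) ** 2 / expected
    return stat


def recovered_associations(rows, num_spatial, crit=CHI2_CRIT_DF1_ALPHA05):
    """Indices of spatial predicates the test flags as associated with F."""
    return {i for i in range(num_spatial) if chi_square_2x2(rows, i) > crit}


def recovery_rate(true_set, found):
    """Fraction of the known associations that were recovered."""
    return len(true_set & found) / len(true_set)


if __name__ == "__main__":
    rng = random.Random(0)
    truth = {0, 1}  # known associations planted in the toy model
    for n in (50, 500, 5000):
        rows = sample_model(n, [0.3] * 4, truth, 0.9, 0.1, rng)
        found = recovered_associations(rows, 4)
        print(n, sorted(found), recovery_rate(truth, found))
```

Varying `n`, `priors`, `p_high`, `p_low`, or the critical value in this sketch mirrors, in a very reduced form, how recovery can be studied as a function of sample size, prior probabilities, association strength, and test sensitivity.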

Updated: 2009-12-16