Performance analysis of a dual-tree algorithm for computing spatial distance histograms.
The VLDB Journal (IF 2.8), Pub Date: 2010-10-28, DOI: 10.1007/s00778-010-0205-7
Shaoping Chen, Yi-Cheng Tu, Yuni Xia

Many scientific and engineering fields produce large volumes of spatiotemporal data. The storage, retrieval, and analysis of such data pose great challenges to database system design. Analysis of scientific spatiotemporal data often involves computing functions of all point-to-point interactions. One such analytic, the Spatial Distance Histogram (SDH), is of vital importance to scientific discovery. Recently, algorithms for efficient SDH processing in large-scale scientific databases have been proposed. These algorithms adopt a recursive tree-traversing strategy to process point-to-point distances in the visited tree nodes in batches, and thus require less time than the brute-force approach, in which all pairwise distances have to be computed. Despite the promising experimental results, the complexity of such algorithms has not been thoroughly studied. In this paper, we present an analysis of such algorithms based on a geometric modeling approach. The main technique is to transform the analysis of point counts into a problem of quantifying the area of regions where pairwise distances can be processed in batches by the algorithm. From the analysis, we conclude that the number of pairwise distances that are left to be processed decreases exponentially as more levels of the tree are visited. This leads to the proof of a time complexity lower than the quadratic time needed for a brute-force algorithm and builds the foundation for a constant-time approximate algorithm. Our model is also general in that it works for a wide range of point spatial distributions, histogram types, and space-partitioning options in building the tree.
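
The batch-processing idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes 2D points in the unit square, a plain quadtree, and equal-width buckets, and every name in it (`Node`, `resolve_pair`, `sdh_dual_tree`, and so on) is hypothetical. Two cells are "resolved" in one step when the smallest and largest possible distances between them fall into the same bucket; otherwise the traversal descends one level, and only at the leaves are individual distances computed.

```python
import math
import random
from itertools import combinations


def bucket(d, width, nb):
    """Index of the histogram bucket that distance d falls into."""
    return min(int(d / width), nb - 1)


def dist(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


class Node:
    """Quadtree cell covering [x0, x1) x [y0, y1) and the points inside it."""

    def __init__(self, pts, box, depth, max_depth):
        self.pts, self.box, self.children = pts, box, []
        if depth == max_depth or len(pts) < 2:
            return  # leaf: too deep or nothing left to split
        x0, y0, x1, y1 = box
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        for sub in ((x0, y0, mx, my), (mx, y0, x1, my),
                    (x0, my, mx, y1), (mx, my, x1, y1)):
            inside = [p for p in pts
                      if sub[0] <= p[0] < sub[2] and sub[1] <= p[1] < sub[3]]
            if inside:
                self.children.append(Node(inside, sub, depth + 1, max_depth))


def dist_range(a, b):
    """Smallest and largest possible distance between points of boxes a and b."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    dx_min = max(0.0, ax0 - bx1, bx0 - ax1)
    dy_min = max(0.0, ay0 - by1, by0 - ay1)
    dx_max = max(ax1 - bx0, bx1 - ax0)
    dy_max = max(ay1 - by0, by1 - ay0)
    return math.hypot(dx_min, dy_min), math.hypot(dx_max, dy_max)


def resolve_pair(a, b, hist, width):
    """Add all distances between points of node a and points of node b."""
    nb = len(hist)
    lo, hi = dist_range(a.box, b.box)
    if bucket(lo, width, nb) == bucket(hi, width, nb):
        # Batch step: every distance lies in one bucket, so only counts matter.
        hist[bucket(lo, width, nb)] += len(a.pts) * len(b.pts)
    elif not a.children and not b.children:
        # Two leaves that cannot be resolved: fall back to point distances.
        for p in a.pts:
            for q in b.pts:
                hist[bucket(dist(p, q), width, nb)] += 1
    else:
        # Descend into whichever of the two nodes can still be split.
        for ca in a.children or [a]:
            for cb in b.children or [b]:
                resolve_pair(ca, cb, hist, width)


def resolve_self(n, hist, width):
    """Add all distances between pairs of points inside node n."""
    if not n.children:
        for p, q in combinations(n.pts, 2):
            hist[bucket(dist(p, q), width, len(hist))] += 1
        return
    for c in n.children:
        resolve_self(c, hist, width)
    for ca, cb in combinations(n.children, 2):
        resolve_pair(ca, cb, hist, width)


def sdh_dual_tree(points, width, nb, max_depth=5):
    """SDH of points assumed to lie in the unit square [0, 1) x [0, 1)."""
    hist = [0] * nb
    resolve_self(Node(points, (0.0, 0.0, 1.0, 1.0), 0, max_depth), hist, width)
    return hist


def sdh_brute(points, width, nb):
    """Reference SDH that computes every pairwise distance."""
    hist = [0] * nb
    for p, q in combinations(points, 2):
        hist[bucket(dist(p, q), width, nb)] += 1
    return hist


if __name__ == "__main__":
    random.seed(0)
    pts = [(random.random(), random.random()) for _ in range(500)]
    nb = math.ceil(math.sqrt(2) / 0.1)
    print(sdh_dual_tree(pts, 0.1, nb) == sdh_brute(pts, 0.1, nb))
```

In this sketch, how many pairs are resolved in batches depends on the bucket width relative to the cell size: wider buckets let coarser cells resolve, which is the effect the paper's geometric analysis quantifies.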

Updated: 2010-10-28