XM-tree: data driven computational model by using metric extended nodes with non-overlapping in high-dimensional metric spaces
Computational and Mathematical Organization Theory (IF 1.8), Pub Date: 2018-04-18, DOI: 10.1007/s10588-018-9272-x
Zineddine Kouahla, Adeel Anjum, Sheeraz Akram, Tanzila Saba, José Martinez

Finding objects similar to a query under a given distance remains a fundamental problem in many applications. The general challenge in similarity search is to focus the search on as few elements as possible to find the answer. Index structures divide the target dataset into subsets. With large amounts of data, the volumes of the subspaces grow exponentially, which affects the search algorithms. This problem is caused by inherent deficiencies of space partitioning and by the overlap between regions. These methods have proven unreliable, and it becomes hard to store, manage, and analyze such quantities of data. The search tends to degenerate into a complete scan of the data set. In this paper, we propose a new indexing technique called XM-tree, which partitions the space using spheres. The idea is to combine two structures, arborescent (tree-based) and sequential, in order to limit the volume of the outer regions of the spheres, by creating extended regions and storing them in linked lists, and also by excluding the empty sets (separable partitions) that do not contain objects. The goal is to eliminate some objects without the need to compute their distances to a query object. We also propose a parallel version of the structure, run on a set of real machines. Finally, we discuss the efficiency of the construction and querying phases, and the quality of our index by comparing it with recent techniques.
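To make the pruning idea concrete, here is a minimal sketch of sphere-based filtering in a metric space. It is not the XM-tree itself: it is a flat, single-level ball partition, and every name in it (`Ball`, `build_balls`, `range_query`, the choice of centers) is a hypothetical chosen for illustration. It only shows how the triangle inequality lets a range query discard a whole sphere after one distance computation, and how empty partitions can be dropped at build time.

```python
import math


def euclidean(a, b):
    """Euclidean distance; any metric obeying the triangle inequality works."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class Ball:
    """A sphere: a center, its member points, and the covering radius."""

    def __init__(self, center, points, dist):
        self.center = center
        self.points = points
        self.radius = max(dist(center, p) for p in points)


def build_balls(points, centers, dist):
    """Assign each point to its nearest center (assumed partitioning rule)."""
    groups = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda j: dist(centers[j], p))
        groups[nearest].append(p)
    # Empty partitions are excluded: they can never contain an answer.
    return [Ball(c, g, dist) for c, g in zip(centers, groups) if g]


def range_query(balls, q, r, dist):
    """Return (points within r of q, number of distance computations)."""
    hits, computed = [], 0
    for ball in balls:
        d_qc = dist(q, ball.center)
        computed += 1
        # For any member p: d(q, p) >= d(q, c) - d(c, p) >= d_qc - radius.
        # If that lower bound exceeds r, the whole ball is skipped.
        if d_qc - ball.radius > r:
            continue
        for p in ball.points:
            computed += 1
            if dist(q, p) <= r:
                hits.append(p)
    return hits, computed


points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
balls = build_balls(points, centers=[(0, 0), (10, 10)], dist=euclidean)
hits, computed = range_query(balls, q=(0.5, 0.5), r=1.0, dist=euclidean)
print(hits, computed)
```

In this toy data set the far sphere is rejected after a single distance computation, so the query evaluates fewer distances than a full scan would. How much is saved in practice depends on how tight and how well separated the spheres are, which is the issue the abstract attributes to overlap and region volume.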

Updated: 2018-04-18