LSR-forest: An locality sensitive hashing-based approximate k-nearest neighbor query algorithm on high-dimensional uncertain data
Concurrency and Computation: Practice and Experience (IF 1.5), Pub Date: 2020-04-27, DOI: 10.1002/cpe.5795
Jiagang Wang, Tu Qian, Anbang Yang, Hui Wang, Jiangbo Qian

Uncertain data is widely used in many practical applications, such as data cleaning, location-based services, and privacy protection. As technology develops, data is becoming increasingly high-dimensional. The most common indexes for nearest neighbor search on uncertain data are the R-Tree and the KD-Tree, and both inevitably suffer from the "curse of dimensionality." To address this problem, this article proposes a new hashing algorithm, called the LSR-forest, which is based on locality sensitive hashing and the R-Tree, to solve the approximate nearest neighbor search problem on high-dimensional uncertain data. The LSR-forest hashes similar high-dimensional uncertain data into the same bucket with high probability, and then constructs multiple R-Tree-based indexes over the hashed buckets. At query time, neighbors can be identified by checking the data in the hypercube that contains the query point. The query range can also be adjusted automatically according to the parameter k. Experiments on several datasets are presented in this article. The results show that the LSR-forest is more effective and more efficient than the R-Tree on high-dimensional datasets.
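
The bucketing idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' LSR-forest. It assumes each uncertain object has already been reduced to a single representative point, it uses a standard Gaussian (p-stable) LSH family, and it scans each bucket linearly where the LSR-forest would build an R-Tree. It also omits the automatic adjustment of the query range by k. The class name `LSHBucketIndex` and all parameters are hypothetical.

```python
import math
import random
from collections import defaultdict


class LSHBucketIndex:
    """Toy p-stable (Gaussian) LSH index with several hash tables.

    Assumptions (not from the paper): every uncertain object is represented
    by one point, and buckets are scanned linearly instead of being indexed
    by an R-Tree.
    """

    def __init__(self, dim, num_tables=8, hashes_per_table=4, width=4.0, seed=0):
        rng = random.Random(seed)
        self.width = width
        # One list of (projection vector, offset) pairs per hash table.
        self.tables = [
            [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
             for _ in range(hashes_per_table)]
            for _ in range(num_tables)
        ]
        self.buckets = [defaultdict(list) for _ in range(num_tables)]
        self.points = []

    def _key(self, table, point):
        # h(v) = floor((a . v + b) / w), concatenated over the table's hashes.
        return tuple(
            math.floor((sum(a_i * x_i for a_i, x_i in zip(a, point)) + b) / self.width)
            for a, b in table
        )

    def add(self, point):
        """Store a point in every hash table and return its integer id."""
        idx = len(self.points)
        self.points.append(tuple(point))
        for table, buckets in zip(self.tables, self.buckets):
            buckets[self._key(table, point)].append(idx)
        return idx

    def query(self, q, k):
        """Return ids of up to k approximate nearest neighbors of q."""
        # Candidates are all points sharing a bucket with q in any table.
        candidates = set()
        for table, buckets in zip(self.tables, self.buckets):
            candidates.update(buckets.get(self._key(table, q), ()))
        # Rank candidates by exact Euclidean distance and keep the k closest.
        ranked = sorted(candidates, key=lambda i: math.dist(self.points[i], q))
        return ranked[:k]
```

Because only points that collide with the query in at least one table are examined, the result is approximate and may contain fewer than k ids. Using more tables raises recall at the cost of more candidates to rank.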

Updated: 2020-04-27