Experimental Analysis of Locality Sensitive Hashing Techniques for High-Dimensional Approximate Nearest Neighbor Searches
arXiv - CS - Databases Pub Date: 2020-06-19, DOI: arxiv-2006.11285
Omid Jafari, Parth Nagarkar

Finding nearest neighbors in high-dimensional spaces is a fundamental operation in many multimedia retrieval applications. Exact tree-based indexing approaches are known to suffer from the curse of dimensionality on high-dimensional data. Approximate search techniques trade some accuracy for faster performance while still returning good-enough results. Locality Sensitive Hashing (LSH) is a very popular technique for finding approximate nearest neighbors in high-dimensional spaces. Apart from providing theoretical guarantees on the query results, one of the main benefits of LSH techniques is that they scale well to large datasets because they are external-memory based. The dominant costs of existing LSH techniques are the algorithm time and the index I/Os required to find candidate points. Existing works do not compare both of these costs in their evaluations. In this experimental survey paper, we show the impact of both costs on the overall performance of an LSH technique. We compare three state-of-the-art techniques on four real-world datasets and show that, contrary to recent works, C2LSH is still the state-of-the-art algorithm in terms of performance while achieving accuracy similar to that of its recent competitors.
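
To make the two costs named in the abstract concrete, here is a minimal in-memory sketch of a classic bucket-based LSH index for Euclidean distance, using random-projection hashes of the form floor((a·v + b) / w). It is not C2LSH or any of the techniques the paper compares. The class name `LSHIndex`, the helper `make_hash`, and the parameters `num_tables`, `k`, and `w` are all assumptions made for illustration. The query returns the number of candidate points alongside the result, as a rough stand-in for the index I/O cost; the final ranking by true distance stands in for algorithm time.

```python
import math
import random
from collections import defaultdict


def make_hash(dim, k, w, rng):
    """Build one compound hash g(v) = (h_1(v), ..., h_k(v)).

    Each h_i(v) = floor((a_i . v + b_i) / w), with a_i drawn from a
    Gaussian and b_i drawn uniformly from [0, w).
    """
    directions = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]
    offsets = [rng.uniform(0.0, w) for _ in range(k)]

    def g(v):
        return tuple(
            math.floor((sum(a_j * v_j for a_j, v_j in zip(a, v)) + b) / w)
            for a, b in zip(directions, offsets)
        )

    return g


class LSHIndex:
    """Hypothetical illustration: several hash tables, one compound hash each."""

    def __init__(self, dim, num_tables=8, k=4, w=4.0, seed=0):
        rng = random.Random(seed)
        self.hashes = [make_hash(dim, k, w, rng) for _ in range(num_tables)]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.points = []

    def add(self, v):
        """Store the point and register its id in one bucket per table."""
        idx = len(self.points)
        self.points.append(v)
        for g, table in zip(self.hashes, self.tables):
            table[g(v)].append(idx)
        return idx

    def query(self, q, top=1):
        """Return (ids of the closest candidates, number of candidates)."""
        candidates = set()
        for g, table in zip(self.hashes, self.tables):
            # Only the bucket that q falls into is read from each table.
            candidates.update(table.get(g(q), ()))
        # Candidates are re-ranked by their true Euclidean distance to q.
        ranked = sorted(candidates, key=lambda i: math.dist(q, self.points[i]))
        return ranked[:top], len(candidates)


if __name__ == "__main__":
    rng = random.Random(42)
    index = LSHIndex(dim=16)
    for _ in range(1000):
        index.add([rng.gauss(0.0, 1.0) for _ in range(16)])
    neighbors, num_candidates = index.query(index.points[10], top=3)
    print(neighbors, num_candidates)
```

Raising `num_tables` or widening `w` in this sketch generally pulls in more candidates, which tends to improve accuracy at the price of more bucket reads and more distance computations. That is the kind of trade-off between the two costs that the abstract refers to.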

Updated: 2020-06-23