Random projection-based auxiliary information can improve tree-based nearest neighbor search
Information Sciences Pub Date : 2020-09-03 , DOI: 10.1016/j.ins.2020.08.054
Omid Keivani , Kaushik Sinha

Nearest neighbor search using random projection trees has recently been shown to outperform locality-sensitive hashing-based methods, achieving better accuracy while retrieving fewer data points. However, to achieve acceptable nearest neighbor search accuracy in large-scale applications, where the number of data points and/or the number of features can be very large, users must maintain, store, and search through a large number of such independent random projection trees, which may be undesirable in many practical applications. To address this issue, this paper presents different search strategies that improve the nearest neighbor search performance of a single random projection tree. Our approach exploits properties of single and multiple random projections, which allows us to store meaningful auxiliary information at the internal nodes of a random projection tree and to design priority functions that guide the search process, resulting in improved nearest neighbor search performance. Empirical results on multiple real-world datasets show that our proposed method significantly improves the nearest neighbor search accuracy of a single tree compared to baseline methods.
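
To make the idea of priority-guided search in a single tree concrete, here is a minimal Python sketch. It is not the authors' method: the abstract does not specify their auxiliary information or priority functions, so this sketch assumes a generic stand-in in which each internal node stores its unit-length projection direction and median threshold, and unexplored branches are ranked by the query's distance to the splitting hyperplane. The names build_rp_tree, search, leaf_size, and max_leaves are hypothetical.

```python
import heapq
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def build_rp_tree(points, ids, rng, leaf_size=8):
    """Recursively split the points at the median of a random projection."""
    if len(ids) <= leaf_size:
        return {"leaf": ids}
    dim = len(points[0])
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(dot(direction, direction))
    direction = [d / norm for d in direction]  # unit length: margins are distances
    proj = sorted((dot(points[i], direction), i) for i in ids)
    mid = len(proj) // 2
    return {
        "direction": direction,
        "threshold": proj[mid][0],
        "left": build_rp_tree(points, [i for _, i in proj[:mid]], rng, leaf_size),
        "right": build_rp_tree(points, [i for _, i in proj[mid:]], rng, leaf_size),
    }


def search(tree, points, query, max_leaves=4):
    """Visit at most max_leaves leaves, trying small-margin branches first.

    Assumed priority (not from the paper): the largest distance from the
    query to any splitting hyperplane crossed to reach a branch.
    """
    best = (float("inf"), -1)
    counter = 0  # tie-breaker so the heap never has to compare dicts
    heap = [(0.0, counter, tree)]
    visited = 0
    while heap and visited < max_leaves:
        priority, _, node = heapq.heappop(heap)
        while "leaf" not in node:
            margin = dot(query, node["direction"]) - node["threshold"]
            if margin < 0:
                near, far = node["left"], node["right"]
            else:
                near, far = node["right"], node["left"]
            counter += 1
            heapq.heappush(heap, (max(priority, abs(margin)), counter, far))
            node = near
        visited += 1
        for i in node["leaf"]:
            best = min(best, (sq_dist(points[i], query), i))
    return best[1], visited
```

With max_leaves=1 this reduces to a plain root-to-leaf descent; raising max_leaves trades more retrieved points for higher accuracy from the same tree, which is the trade-off the abstract describes improving.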




Updated: 2020-09-03