Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8), Pub Date: 2018-12-28, DOI: 10.1109/tpami.2018.2889473
Yu A. Malkov , D. A. Yashunin

We present a new approach for the approximate K-nearest neighbor search based on navigable small world graphs with controllable hierarchy (Hierarchical NSW, HNSW). The proposed solution is fully graph-based, without any need for additional search structures (typically used at the coarse search stage of most proximity graph techniques). Hierarchical NSW incrementally builds a multi-layer structure consisting of a hierarchical set of proximity graphs (layers) for nested subsets of the stored elements. The maximum layer in which an element is present is selected randomly with an exponentially decaying probability distribution. This allows producing graphs similar to the previously studied Navigable Small World (NSW) structures while additionally having the links separated by their characteristic distance scales. Starting the search from the upper layer together with utilizing the scale separation boosts the performance compared to NSW and allows a logarithmic complexity scaling. Additional employment of a heuristic for selecting proximity graph neighbors significantly increases performance at high recall and in the case of highly clustered data. Performance evaluation has demonstrated that the proposed general metric space search index is able to strongly outperform previous open-source state-of-the-art vector-only approaches. Similarity of the algorithm to the skip list structure allows a straightforward balanced distributed implementation.
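
The two mechanisms the abstract names, random layer assignment with exponentially decaying probability and a search that starts at the upper layer, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the names `sample_max_layer`, `layer_histogram`, and `greedy_descent` are hypothetical, the draw `floor(-ln(U) * m_l)` is one common way to obtain an exponentially decaying layer distribution, the normalization `m_l = 1 / ln(16)` is an arbitrary choice, and the search shown is a plain greedy walk that tracks a single candidate and omits the neighbor-selection heuristic.

```python
import math
import random
from collections import Counter


def sample_max_layer(m_l: float, rng: random.Random) -> int:
    """Draw the top layer of a new element as floor(-ln(U) * m_l).

    m_l is an assumed normalization factor; larger values give taller hierarchies.
    """
    u = 1.0 - rng.random()  # random() is in [0, 1); shift to (0, 1] so log is defined
    return int(-math.log(u) * m_l)  # value is non-negative, so int() acts as floor


def layer_histogram(n: int, m_l: float, seed: int = 0) -> Counter:
    """Count how many of n elements receive each maximum layer."""
    rng = random.Random(seed)
    return Counter(sample_max_layer(m_l, rng) for _ in range(n))


def greedy_descent(layers, dist, query, entry):
    """Greedy walk from the top layer down to layer 0.

    layers[i] is an adjacency dict {node: [neighbors]} for layer i, where
    layer 0 holds every element. On each layer, move to any neighbor that is
    strictly closer to the query until none is, then drop one layer.
    """
    current = entry
    for graph in reversed(layers):
        improved = True
        while improved:
            improved = False
            for neighbor in graph.get(current, ()):
                if dist(neighbor, query) < dist(current, query):
                    current = neighbor
                    improved = True
    return current


if __name__ == "__main__":
    # Layer populations shrink roughly geometrically with the layer index.
    hist = layer_histogram(10_000, m_l=1.0 / math.log(16))
    print(sorted(hist.items()))

    # Toy 1-D example: points 0..9, sparse long links on top, short links below.
    bottom = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 9] for i in range(10)}
    top = {0: [5], 5: [0, 9], 9: [5]}
    print(greedy_descent([bottom, top], lambda a, q: abs(a - q), query=7, entry=0))
```

In the toy example the upper layer covers most of the distance with one long link, and the bottom layer finishes with short links, which is the scale separation the abstract credits for the improvement over a single-layer NSW graph.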

Updated: 2024-08-22