A scalable solution to the nearest neighbor search problem through local-search methods on neighbor graphs
Pattern Analysis and Applications (IF 3.9), Pub Date: 2021-01-04, DOI: 10.1007/s10044-020-00946-w
Eric S. Tellez, Guillermo Ruiz, Edgar Chavez, Mario Graff

Nearest neighbor search is a powerful abstraction for data access; however, data indexing is troublesome even for approximate indexes. For intrinsically high-dimensional data, fast, high-quality searches demand indexes with either impractically large memory usage or impractically long preprocessing time. In this paper, we introduce an algorithm that solves a nearest-neighbor query q by minimizing a kernel function defined by the distance from q to each object in the database. The minimization is performed with metaheuristics so that the problem is solved rapidly; although some methods in the literature use this strategy behind the scenes, our approach is the first to use it explicitly. We also provide two approaches for selecting edges in the graph's construction stage that limit the memory footprint and reduce the number of free parameters at the same time. We carry out a thorough experimental comparison with state-of-the-art indexes on synthetic and real-world datasets; we found that our contributions achieve competitive performance in speed, accuracy, and memory on almost all of our benchmarks.
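The idea of answering a query by local search over a neighbor graph can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a brute-force k-nearest-neighbor graph, squared Euclidean distance as the function being minimized, and plain greedy descent with random restarts standing in for the metaheuristics the abstract mentions. All function names (`sq_dist`, `build_knn_graph`, `greedy_search`, `search`) and parameters (`k`, `restarts`, `seed`) are hypothetical and chosen only for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_knn_graph(points, k):
    """Brute-force neighbor graph: graph[i] holds the k nearest other points.

    Assumption: quadratic-time construction, acceptable only for small demos.
    """
    graph = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: sq_dist(p, points[j]),
        )
        graph.append(others[:k])
    return graph


def greedy_search(points, graph, q, start):
    """Greedy descent: move to the neighbor closest to q until none improves."""
    current = start
    best = sq_dist(q, points[current])
    while True:
        nxt = None
        for j in graph[current]:
            d = sq_dist(q, points[j])
            if d < best:
                best, nxt = d, j
        if nxt is None:
            # No neighbor is closer to q: current is a local minimum.
            return current, best
        current = nxt


def search(points, graph, q, restarts=8, seed=0):
    """Run greedy descent from several random starts and keep the best result."""
    rng = random.Random(seed)
    results = [
        greedy_search(points, graph, q, rng.randrange(len(points)))
        for _ in range(restarts)
    ]
    return min(results, key=lambda r: r[1])
```

The returned vertex is only guaranteed to be a local minimum of the distance to q over the graph, which is why a sketch like this is approximate; restarts (or a richer metaheuristic) reduce the chance of stopping at a poor local minimum, and the number of edges per vertex (`k` here) trades memory for search quality.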




Updated: 2021-01-05