Fast kNN query processing over a multi-node GPU environment
The Journal of Supercomputing (IF 2.5), Pub Date: 2021-07-15, DOI: 10.1007/s11227-021-03975-2
Ricardo J. Barrientos 1, 2 , Javier A. Riquelme 1, 2 , Ruber Hernández-García 1, 3 , Wladimir Soto-Silva 2 , Cristóbal A. Navarro 4

The kNN (k nearest neighbors) search is applied in a wide range of fields, including data mining, multimedia, information retrieval, machine learning, and pattern recognition. Most solutions for this type of search are restricted to metric spaces or limited to low-dimensional data. Our proposed algorithm takes a set of values (or measures) as input and returns the K lowest values from that set; it can be used with measures obtained from metric or non-metric spaces, or from high-dimensional databases. In this work, we introduce a novel GPU-based exhaustive algorithm for solving kNN queries, which is composed of two steps. The first step uses pivots to reduce the search range, and the second uses a set of heaps as auxiliary structures to return the final results. We also extended the algorithm to run on a multi-GPU platform and on a multi-node/multi-GPU platform. To the best of our knowledge, and considering the state-of-the-art technical literature, this work uses the largest database (in terms of data volume) to process a kNN query, with up to 13,189 million elements, and achieves a speed-up of up to 1843× on a 5-node/20-GPU platform.
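
The two-step idea described in the abstract can be illustrated with a small CPU-only sketch in Python. This is not the authors' GPU implementation: the function name `k_lowest`, the parameters `num_pivots` and `num_heaps`, the choice of pivots by random sampling, and the round-robin assignment of candidates to heaps are all assumptions made for illustration. Step 1 uses the pivots to find a threshold that at least K values fall under, and step 2 keeps one bounded max-heap per partition and merges them.

```python
import heapq
import random
from bisect import bisect_left


def k_lowest(values, k, num_pivots=32, num_heaps=4, seed=0):
    """Return the k lowest numeric values in ascending order (illustrative sketch)."""
    n = len(values)
    if k <= 0 or n == 0:
        return []
    if k >= n:
        return sorted(values)

    # Step 1 (assumed pivot scheme): sample pivots, histogram the values
    # into the ranges between consecutive pivots, and pick the smallest
    # pivot that has at least k values at or below it.
    rng = random.Random(seed)
    pivots = sorted(rng.sample(values, min(num_pivots, n)))
    counts = [0] * (len(pivots) + 1)
    for v in values:
        counts[bisect_left(pivots, v)] += 1

    threshold = None  # None means "no pivot was usable, keep everything"
    running = 0
    for i, pivot in enumerate(pivots):
        running += counts[i]
        if running >= k:
            threshold = pivot
            break

    candidates = [v for v in values if threshold is None or v <= threshold]

    # Step 2: each heap stands in for one worker; it keeps the k lowest
    # values of its share as a max-heap (values are stored negated).
    heaps = [[] for _ in range(num_heaps)]
    for i, v in enumerate(candidates):
        h = heaps[i % num_heaps]
        if len(h) < k:
            heapq.heappush(h, -v)
        elif v < -h[0]:
            heapq.heapreplace(h, -v)

    # Merge the per-worker results and keep the global k lowest.
    merged = [-x for h in heaps for x in h]
    return heapq.nsmallest(k, merged)


if __name__ == "__main__":
    rng = random.Random(42)
    distances = [rng.uniform(0, 1000) for _ in range(5000)]
    print(k_lowest(distances, 5))
```

The pivot filter is safe because at least K values lie at or below the chosen threshold, so none of the K lowest values can be discarded. In a multi-GPU or multi-node setting, the same merge step could plausibly combine the partial results of each device, although the abstract does not describe how the authors do this.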




Updated: 2021-07-15