Efficient parallel A* search on multi-GPU system
Future Generation Computer Systems (IF 6.2), Pub Date: 2021-04-23, DOI: 10.1016/j.future.2021.04.011
Xin He, Yapeng Yao, Zhiwen Chen, Jianhua Sun, Hao Chen

A* search is a best-first search algorithm that is widely used in pathfinding and graph traversal. To meet the ever-increasing demand for performance, various high-performance architectures (e.g., multi-core CPUs and GPUs) have been explored to accelerate A* search. However, current GPU-based A* search approaches are designed only for a single-GPU architecture. As the amount of data grows at an exponential rate, it becomes inefficient or even infeasible for current A* implementations to process data sets entirely on a single GPU.

In this paper, we propose DA*, a parallel A* search algorithm based on the multi-GPU architecture. DA* enables efficient acceleration of the A* algorithm on multiple GPUs through effective graph partitioning and data communication strategies. To make the most of the parallelism of the multi-GPU architecture, in the state expansion phase we adopt multiple priority queues for the open list, which allows multiple states to be processed in parallel. In addition, we use parallel hashing with replacement and a frontier search mechanism to address node duplication detection and memory bottlenecks, respectively. The evaluation shows that DA* is effective and efficient in accelerating A*-based computational tasks on a multi-GPU system. Compared to the state-of-the-art A* search algorithm based on a single GPU, our algorithm achieves up to a 3× speedup with four GPUs.
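
To illustrate two of the ideas named in the abstract (multiple priority queues for the open list, and hashing with replacement for duplicate detection), the following is a minimal single-process Python sketch. It is not the authors' DA* implementation: the function names (multi_queue_astar, grid_neighbors, manhattan), the hash-based routing of successors to queues, the fixed-size table layout, and the termination rule are all assumptions made for illustration, and the per-queue extraction that would run concurrently on GPUs is simulated sequentially. Frontier search and graph partitioning are not shown. The sketch assumes an admissible heuristic, positive edge costs, and a reachable goal.

```python
import heapq


def multi_queue_astar(start, goal, neighbors, h, num_queues=4, table_size=1024):
    """Sequential simulation of A* with several open-list priority queues.

    neighbors(s) yields (successor, edge_cost) pairs; h(s) must be admissible.
    Returns the cost of a cheapest path, or None if the open list empties.
    """
    queues = [[] for _ in range(num_queues)]
    heapq.heappush(queues[0], (h(start), 0, start))
    # Fixed-size table: slot -> (state, best g seen). A colliding state simply
    # overwrites the slot, so some duplicates slip through but memory is bounded.
    table = [None] * table_size
    table[hash(start) % table_size] = (start, 0)
    best_goal_cost = None

    while any(queues):
        # Stop once no queue can still hold a path cheaper than the best found.
        min_f = min(q[0][0] for q in queues if q)
        if best_goal_cost is not None and min_f >= best_goal_cost:
            break
        # Extract one state per non-empty queue (done concurrently on real hardware).
        batch = [heapq.heappop(q) for q in queues if q]
        for _f, g, state in batch:
            if state == goal:
                if best_goal_cost is None or g < best_goal_cost:
                    best_goal_cost = g
                continue
            for succ, cost in neighbors(state):
                new_g = g + cost
                slot = hash(succ) % table_size
                entry = table[slot]
                if entry is not None and entry[0] == succ and entry[1] <= new_g:
                    continue  # already reached at least as cheaply: duplicate
                table[slot] = (succ, new_g)  # replace whatever was in the slot
                # Assumed routing rule: choose the target queue by state hash.
                target = hash(succ) % num_queues
                heapq.heappush(queues[target], (new_g + h(succ), new_g, succ))
    return best_goal_cost


def grid_neighbors(width, height, walls=frozenset()):
    """Hypothetical helper: 4-connected grid with unit edge costs."""
    def neighbors(state):
        x, y = state
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height and (nx, ny) not in walls:
                yield (nx, ny), 1
    return neighbors


def manhattan(goal):
    """Hypothetical helper: Manhattan-distance heuristic toward goal."""
    return lambda s: abs(s[0] - goal[0]) + abs(s[1] - goal[1])
```

For example, multi_queue_astar((0, 0), (4, 4), grid_neighbors(5, 5), manhattan((4, 4))) searches an open 5×5 grid. Because extracting from several queues at once can reach the goal before its cheapest path is known, the sketch keeps the best goal cost seen so far and stops only when no queue's minimum f-value can improve on it. Letting a colliding entry overwrite a table slot trades some repeated expansions for a fixed memory footprint.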




Updated: 2021-04-29