PIM-Align: A Processing-in-Memory Architecture for FM-Index Search Algorithm
Journal of Computer Science and Technology (IF 1.9), Pub Date: 2021-01-30, DOI: 10.1007/s11390-020-0825-3
Xue-Qi Li, Guang-Ming Tan, Ning-Hui Sun

Genomic sequence alignment is the most critical and time-consuming step in genomic analysis. Alignment algorithms generally follow a seed-and-extend model. Acceleration of the extension phase (e.g., the Smith-Waterman algorithm) has been well explored in computing-centric architectures on field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and graphics processing units (GPUs). Compared with the extension phase, the seeding phase is more critical and essential. However, the seeding phase is memory-bound: on conventional systems it is limited by fine-grained random memory accesses and limited parallelism. In this paper, we argue that the processing-in-memory (PIM) concept could be a viable solution to these problems. This paper describes "PIM-Align", an application-driven near-data processing architecture for sequence alignment. To achieve performance proportional to memory capacity by taking advantage of 3D-stacked dynamic random access memory (DRAM) technology, we propose a lightweight message mechanism between different memory partitions and a specialized hardware prefetcher for the memory access patterns of sequence alignment. Our evaluation shows that the proposed architecture can achieve 20x and 1,820x speedups compared with the best available ASIC implementation and the software running on a 32-thread CPU, respectively.
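To make the memory-bound nature of seeding concrete, here is a minimal software sketch of the textbook FM-index backward search named in the title. It is not the PIM-Align hardware design, and the function names `build_fm_index` and `backward_search` are hypothetical, chosen only for illustration. The toy index stores a full occurrence table, which real aligners replace with sampled checkpoints.

```python
def build_fm_index(text):
    """Build a toy FM-index: suffix array, C table, and full Occ table."""
    text += "$"  # sentinel, assumed smaller than every other character
    # Naive suffix array construction; fine for short illustrative inputs.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # c_table[c] = number of characters in text strictly smaller than c.
    c_table, total = {}, 0
    for ch in alphabet:
        c_table[ch] = total
        total += text.count(ch)
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, c_table, occ


def backward_search(pattern, sa, c_table, occ):
    """Return the sorted start positions of pattern in the indexed text."""
    lo, hi = 0, len(sa)  # half-open interval of matching suffix array rows
    for ch in reversed(pattern):
        if ch not in c_table:
            return []
        # The two Occ lookups land at data-dependent rows lo and hi,
        # which is the fine-grained random access the abstract refers to.
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


index = build_fm_index("ACGTACGT")
print(backward_search("CGT", *index))
```

Each pattern character costs two small lookups whose addresses depend on the previous step, so the loop does little arithmetic per memory access and offers poor locality for caches.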




Updated: 2021-02-07