GenASM: A High-Performance, Low-Power Approximate String Matching Acceleration Framework for Genome Sequence Analysis
arXiv - CS - Hardware Architecture, Pub Date: 2020-09-16, DOI: arxiv-2009.07692
Damla Senol Cali, Gurpreet S. Kalsi, Zülal Bingöl, Can Firtina, Lavanya Subramanian, Jeremie S. Kim, Rachata Ausavarungnirun, Mohammed Alser, Juan Gomez-Luna, Amirali Boroumand, Anant Nori, Allison Scibisz, Sreenivas Subramoney, Can Alkan, Saugata Ghose, Onur Mutlu

Genome sequence analysis has enabled significant advancements in medical and scientific areas such as personalized medicine, outbreak tracing, and the understanding of evolution. Unfortunately, it is currently bottlenecked by the computational power and memory bandwidth limitations of existing systems, as many of the steps in genome sequence analysis must process a large amount of data. A major contributor to this bottleneck is approximate string matching (ASM). We propose GenASM, the first ASM acceleration framework for genome sequence analysis. We modify the underlying ASM algorithm (Bitap) to significantly increase its parallelism and reduce its memory footprint, and we design the first hardware accelerator for Bitap. Our hardware accelerator consists of specialized compute units and on-chip SRAMs that are designed to match the rate of computation with memory capacity and bandwidth. We demonstrate that GenASM is a flexible, high-performance, and low-power framework, which provides significant performance and power benefits for three different use cases in genome sequence analysis: 1) GenASM accelerates read alignment for both long reads and short reads. For long reads, GenASM outperforms state-of-the-art software and hardware accelerators by 116x and 3.9x, respectively, while consuming 37x and 2.7x less power. For short reads, GenASM outperforms state-of-the-art software and hardware accelerators by 111x and 1.9x. 2) GenASM accelerates pre-alignment filtering for short reads, with 3.7x the performance of a state-of-the-art pre-alignment filter, while consuming 1.7x less power and significantly improving the filtering accuracy. 3) GenASM accelerates edit distance calculation, with 22-12501x and 9.3-400x speedups over the state-of-the-art software library and FPGA-based accelerator, respectively, while consuming 548-582x and 67x less power.
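For readers unfamiliar with Bitap, the following is a minimal sketch of the textbook bit-parallel formulation with up to k edits (in the style of Wu and Manber). It is not GenASM's modified algorithm and says nothing about the hardware design; the function name `bitap_search`, its return format, and the choice of a set bit to mean "prefix matched" are assumptions made for illustration.

```python
def bitap_search(text, pattern, k):
    """Return (end_index, errors) for every text position where `pattern`
    matches a substring ending there with at most k edits.

    Illustrative textbook Bitap only; names and conventions are assumptions.
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    full = (1 << m) - 1          # keep state vectors m bits wide
    accept = 1 << (m - 1)        # bit set when the whole pattern has matched

    # masks[c] has bit i set when pattern[i] == c
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    # R[d] bit i set: pattern[:i+1] matches a suffix of the text read so far
    # with at most d edits. Before any text, d pattern characters can be deleted.
    R = [((1 << d) - 1) & full for d in range(k + 1)]

    hits = []
    for pos, ch in enumerate(text):
        mask = masks.get(ch, 0)
        prev_old = R[0]                      # R[d-1] before this character
        R[0] = ((R[0] << 1) | 1) & mask      # exact-match transition
        for d in range(1, k + 1):
            cur_old = R[d]
            R[d] = (
                (((cur_old << 1) | 1) & mask)          # match
                | prev_old                             # extra text character
                | (((prev_old | R[d - 1]) << 1) | 1)   # substitution / skipped pattern character
            ) & full
            prev_old = cur_old
        for d in range(k + 1):               # report the smallest error count
            if R[d] & accept:
                hits.append((pos, d))
                break
    return hits
```

For example, `bitap_search("ACGTACGT", "CGT", 0)` returns `[(3, 0), (7, 0)]`. Each text character costs a fixed number of bitwise operations per error level, which is the kind of regular, bitwise structure the abstract refers to when it speaks of increasing parallelism and designing specialized compute units.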

Updated: 2020-09-17