GenASM: A High-Performance, Low-Power Approximate String Matching Acceleration Framework for Genome Sequence Analysis
arXiv - CS - Hardware Architecture, Pub Date: 2020-09-16, DOI: arxiv-2009.07692
Damla Senol Cali, Gurpreet S. Kalsi, Zülal Bingöl, Can Firtina, Lavanya Subramanian, Jeremie S. Kim, Rachata Ausavarungnirun, Mohammed Alser, Juan Gomez-Luna, Amirali Boroumand, Anant Nori, Allison Scibisz, Sreenivas Subramoney, Can Alkan, Saugata Ghose, Onur Mutlu
Genome sequence analysis has enabled significant advancements in medical and
scientific areas such as personalized medicine, outbreak tracing, and the
understanding of evolution. Unfortunately, it is currently bottlenecked by the
computational power and memory bandwidth limitations of existing systems, as
many of the steps in genome sequence analysis must process a large amount of
data. A major contributor to this bottleneck is approximate string matching
(ASM).
We propose GenASM, the first ASM acceleration framework for genome sequence
analysis. We modify the underlying ASM algorithm (Bitap) to significantly
increase its parallelism and reduce its memory footprint, and we design the
first hardware accelerator for Bitap. Our hardware accelerator consists of
specialized compute units and on-chip SRAMs that are designed to match the rate
of computation with memory capacity and bandwidth.
We demonstrate that GenASM is a flexible, high-performance, and low-power
framework, which provides significant performance and power benefits for three
different use cases in genome sequence analysis: 1) GenASM accelerates read
alignment for both long reads and short reads. For long reads, GenASM
outperforms state-of-the-art software and hardware accelerators by 116x and
3.9x, respectively, while consuming 37x and 2.7x less power. For short reads,
GenASM outperforms state-of-the-art software and hardware accelerators by 111x
and 1.9x. 2) GenASM accelerates pre-alignment filtering for short reads, with
3.7x the performance of a state-of-the-art pre-alignment filter, while
consuming 1.7x less power and significantly improving the filtering accuracy.
3) GenASM accelerates edit distance calculation, with 22-12501x and 9.3-400x
speedups over the state-of-the-art software library and FPGA-based accelerator,
respectively, while consuming 548-582x and 67x less power.
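The abstract names Bitap as the baseline algorithm that GenASM modifies. For orientation only, here is a minimal sketch of the textbook Bitap recurrence with up to k edits (the Wu-Manber formulation). It is not GenASM's modified algorithm and says nothing about the hardware design. The function name `bitap_search`, the return format, and the convention that a set bit means "match" are assumptions made for this illustration.

```python
def bitap_search(text, pattern, k):
    """Textbook Bitap (Wu-Manber) approximate search; illustrative only.

    Returns a list of (end_index, edits) pairs: for every text position
    where some substring ending there matches `pattern` with at most `k`
    edits (substitution, insertion, deletion), report the smallest such
    edit count.
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")

    # masks[c] has bit i set when pattern[i] == c.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    full = (1 << m) - 1        # keep state vectors m bits wide
    accept = 1 << (m - 1)      # bit set when the whole pattern has matched

    # R[d] bit i set: pattern[0..i] matches a suffix of the text read so
    # far with at most d edits. With d edits, the first d pattern
    # characters can always be "matched" by skipping them.
    R = [((1 << d) - 1) & full for d in range(k + 1)]

    hits = []
    for pos, ch in enumerate(text):
        mask = masks.get(ch, 0)
        prev_old = R[0]                        # R[d-1] before this step
        R[0] = ((R[0] << 1) | 1) & mask        # exact-match row
        for d in range(1, k + 1):
            old = R[d]
            R[d] = ((((old << 1) | 1) & mask)  # characters match
                    | prev_old                 # extra character in text
                    | (prev_old << 1)          # substitution
                    | (R[d - 1] << 1)          # pattern character skipped
                    | 1) & full
            prev_old = old
        for d in range(k + 1):
            if R[d] & accept:
                hits.append((pos, d))
                break
    return hits
```

Each text character costs k + 1 bitwise updates, and every update for error level d depends on the result for level d - 1 within the same step. That dependency chain is one plausible reason the baseline recurrence limits parallelism, which the abstract says the authors address by modifying the algorithm.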
Updated: 2020-09-17