A High-Throughput FPGA Accelerator for Short-Read Mapping of the Whole Human Genome
IEEE Transactions on Parallel and Distributed Systems (IF 5.6), Pub Date: 2021-01-12, DOI: 10.1109/tpds.2021.3051011
Yen-Lung Chen, Bo-Yi Chang, Chia-Hsiang Yang, Tzi-Dar Chiueh

The mapping of DNA subsequences to a known reference genome, referred to as “short-read mapping”, is essential for next-generation sequencing. Hundreds of millions of short reads need to be aligned to a tremendously long reference sequence, which makes short-read mapping very time-consuming. In this article, a high-throughput hardware accelerator is proposed to accelerate this task. A Bloom filter-based candidate mapping location (CML) generator and a folded processing element (PE) array are proposed to address CML selection and the Smith-Waterman (SW) alignment algorithm, respectively. By employing a down-sampling scheme, the proposed CML generator reduces the required memory access by 40 percent compared to the Ferragina-Manzini index (FM-index) solution. The proposed hierarchical Bloom filter (HBF) with optimized parameters achieves a 1.5×10⁴ times acceleration over the conventional Bloom filter. The proposed memory re-allocation scheme further reduces the memory access time for the HBF by a factor of 256. The proposed folded PE array delivers 1.2 to 3.2 times higher giga cell updates per second (GCUPS). The processing time can be further reduced by 53 to 72 percent by employing a fully pipelined PE array that allows for a tailored shift amount for seeding. The accelerator is realized on a Stratix V GX FPGA with 16 GB of external SDRAM. Operated at 200 MHz, the proposed FPGA accelerator delivers 2.1 to 11 times higher throughput, with the highest accuracy (99 percent) and sensitivity (98 percent), compared to state-of-the-art FPGA-based solutions.
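
The two stages named in the abstract, candidate location selection followed by Smith-Waterman alignment, can be illustrated with a minimal software sketch. This is not the paper's hardware design: it uses a plain Bloom filter per reference window rather than the hierarchical Bloom filter, omits down-sampling and memory re-allocation, and assumes a linear gap penalty. All names, window sizes, and scoring values (`BloomFilter`, `build_window_filters`, `candidate_locations`, `smith_waterman_score`, `match=2`, `mismatch=-1`, `gap=-2`) are hypothetical choices made for illustration.

```python
import hashlib


class BloomFilter:
    """Plain (non-hierarchical) Bloom filter over strings. Illustrative only."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode(), salt=i.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        # May report false positives, never false negatives.
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def build_window_filters(reference, window, k, num_bits=1024, num_hashes=3):
    """One filter per reference window (assumed partitioning, not the paper's)."""
    filters = []
    for start in range(0, len(reference), window):
        bf = BloomFilter(num_bits, num_hashes)
        # Extend each window by k - 1 bases so k-mers that straddle a
        # window boundary are still recorded in the window where they start.
        chunk = reference[start:start + window + k - 1]
        for i in range(len(chunk) - k + 1):
            bf.add(chunk[i:i + k])
        filters.append((start, bf))
    return filters


def candidate_locations(read, filters, k):
    """Window start positions whose filter reports the read's first k-mer."""
    seed = read[:k]
    return [start for start, bf in filters if seed in bf]


def smith_waterman_score(read, ref, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with a linear gap penalty (assumed scoring)."""
    prev = [0] * (len(ref) + 1)
    best = 0
    for i in range(1, len(read) + 1):
        curr = [0] * (len(ref) + 1)
        for j in range(1, len(ref) + 1):
            diag = prev[j - 1] + (match if read[i - 1] == ref[j - 1] else mismatch)
            # Each cell update takes the best of: restart, diagonal, up, left.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Under these assumptions, a read would first be passed to `candidate_locations`, and only the reference windows it returns would be scored with `smith_waterman_score`. Each evaluation of the inner loop body is one cell update, the quantity counted by the GCUPS metric.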

Updated: 2021-02-02