DSIM: Distributed Sequence Matching on Near-DRAM Accelerator for Genome Assembly
IEEE Journal on Emerging and Selected Topics in Circuits and Systems (IF 3.7), Pub Date: 5-4-2022, DOI: 10.1109/jetcas.2022.3172774
Aman Sinha, Huei-Chun Yang, Pei-Yi Liu, Yen-Shi Kuo, Yuhao Fang, Tien-Shuo Chang, Ke-Han Li, Bo-Cheng Lai

Matching nucleic acid sequences has long been the performance bottleneck in genome assembly, which aims to connect an enormous number of partial genome reads without prior knowledge of the reference sequence. Querying sequences with the widely adopted FM-Index data structure generates intensive, random data accesses, which lead to inefficient use of the memory system and long runtimes. Existing software FM-Index tools are limited by algorithmic inefficiency and poor processing parallelism. Solutions on GPU, FPGA and ASIC focus mainly on computational acceleration and remain bottlenecked by the memory-bound nature of FM-Index querying. This paper proposes DSIM, a scalable FM-Index querying scheme on near-DRAM accelerators. DSIM supports highly parallel multi-step query processing by distributing partial FM-Index tables to different DRAM chips. Each genome sequence is partitioned into shorter queries, which are dispatched to the corresponding DRAM chips for string lookup. The optimized data layout and execution control on DRAM enable high row-data reuse and minimize CPU-DRAM data transfers. A light-weight mapping scheme on the host CPU distributes queries effectively to the DRAM chips and supports scaling to multiple DIMMs (Dual-Inline Memory Modules). An in-DRAM arbiter controls the intra-chip data processing without affecting the external memory controller or the DDR protocol. Experiments on a 128-chip DRAM system show that DSIM achieves up to 231× and 8.9× overall speedup compared to the software FM-Index tool and the state-of-the-art near-DRAM solution, respectively.
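
For readers unfamiliar with FM-Index querying, the following is a minimal sketch of the textbook backward-search procedure, not DSIM's data layout or hardware flow. The function names `build_fm_index` and `count_occurrences` are hypothetical, and the index is built naively for illustration only. Each loop iteration reads two entries of the occurrence table at data-dependent positions, which illustrates the kind of random access pattern the abstract attributes to FM-Index queries.

```python
def build_fm_index(text):
    """Build a toy FM-Index for `text` (illustrative, not space-efficient).

    Returns (c_table, occ, n) where
      c_table[ch] = number of characters in the text strictly smaller than ch,
      occ[ch][i]  = number of occurrences of ch in the first i BWT characters,
      n           = length of the text including the '$' sentinel.
    """
    text += "$"  # sentinel, assumed smaller than every other character
    # Naive suffix array: sort suffix start positions by the suffix itself.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))

    c_table, total = {}, 0
    for ch in alphabet:
        c_table[ch] = total
        total += text.count(ch)

    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return c_table, occ, len(bwt)


def count_occurrences(pattern, c_table, occ, n):
    """Count matches of `pattern` by backward search over the FM-Index."""
    lo, hi = 0, n  # half-open suffix-array interval [lo, hi)
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        # Two data-dependent table lookups per query character.
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    c_table, occ, n = build_fm_index("ACGTACGTAC")
    print(count_occurrences("ACG", c_table, occ, n))
```

In this sketch a query of length m costs 2m table lookups whose addresses depend on the previous step, so the steps of one query cannot be reordered, while independent queries can proceed in parallel.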

Updated: 2024-08-28