SIMDRAM: An End-to-End Framework for Bit-Serial SIMD Computing in DRAM
arXiv - CS - Hardware Architecture, Pub Date: 2021-05-26, DOI: arxiv-2105.12839
Nastaran Hajinazar, Geraldo F. Oliveira, Sven Gregorio, João Ferreira, Nika Mansouri Ghiasi, Minesh Patel, Mohammed Alser, Saugata Ghose, Juan Gómez Luna, Onur Mutlu

Processing-using-DRAM has been proposed for a limited set of basic operations (i.e., logic operations, addition). However, in order to enable full adoption of processing-using-DRAM, it is necessary to provide support for more complex operations. In this paper, we propose SIMDRAM, a flexible general-purpose processing-using-DRAM framework that (1) enables the efficient implementation of complex operations, and (2) provides a flexible mechanism to support the implementation of arbitrary user-defined operations. The SIMDRAM framework comprises three key steps. The first step builds an efficient MAJ/NOT representation of a given desired operation. The second step allocates DRAM rows that are reserved for computation to the operation's input and output operands, and generates the required sequence of DRAM commands to perform the MAJ/NOT implementation of the desired operation in DRAM. The third step uses the SIMDRAM control unit located inside the memory controller to manage the computation of the operation from start to end, by executing the DRAM commands generated in the second step of the framework. We design the hardware and ISA support for the SIMDRAM framework to (1) address key system integration challenges, and (2) allow programmers to employ new SIMDRAM operations without hardware changes. We evaluate SIMDRAM for reliability, area overhead, throughput, and energy efficiency using a wide range of operations and seven real-world applications to demonstrate SIMDRAM's generality. Using 16 DRAM banks, SIMDRAM provides (1) 88x and 5.8x the throughput, and 257x and 31x the energy efficiency, of a CPU and a high-end GPU, respectively, across 16 operations; and (2) 21x and 2.1x the performance of the CPU and GPU, respectively, across seven real-world applications. SIMDRAM incurs an area overhead of only 0.2% in a high-end CPU.
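As a rough illustration of what a MAJ/NOT representation of a bit-serial operation can look like, the following Python sketch simulates addition in software. It is not the paper's implementation: the names (`maj`, `not_`, `to_vertical`, `from_vertical`, `bit_serial_add`), the lane count, and the use of Python integers to stand in for DRAM rows are all assumptions made for the example. Each "row" holds one bit position of every element, so a single MAJ or NOT acts on all lanes at once.

```python
# Illustrative software model only; all names and parameters are hypothetical.
LANES = 8                    # assumed number of SIMD lanes (DRAM columns)
MASK = (1 << LANES) - 1      # keeps NOT results within LANES bits


def maj(a, b, c):
    """Bitwise 3-input majority, applied to every lane at once."""
    return (a & b) | (a & c) | (b & c)


def not_(a):
    """Bitwise NOT restricted to the simulated lanes."""
    return ~a & MASK


def to_vertical(values, width):
    """Transpose a list of integers into `width` rows, one row per bit position."""
    rows = []
    for bit in range(width):
        row = 0
        for lane, value in enumerate(values):
            row |= ((value >> bit) & 1) << lane
        rows.append(row)
    return rows


def from_vertical(rows, lanes):
    """Inverse of to_vertical: rebuild one integer per lane."""
    return [
        sum(((row >> lane) & 1) << bit for bit, row in enumerate(rows))
        for lane in range(lanes)
    ]


def bit_serial_add(a_rows, b_rows):
    """Add two vertically laid-out operands using only MAJ and NOT.

    Processes one bit position per iteration (least significant first);
    the final carry is dropped, so results wrap modulo 2**width.
    """
    carry = 0
    out_rows = []
    for a, b in zip(a_rows, b_rows):
        carry_out = maj(a, b, carry)
        # Full-adder sum expressed with majority and inversion only.
        total = maj(not_(carry_out), maj(a, b, not_(carry)), carry)
        out_rows.append(total)
        carry = carry_out
    return out_rows


if __name__ == "__main__":
    xs = [1, 2, 3, 250, 17, 99, 128, 255]
    ys = [5, 7, 9, 10, 200, 1, 128, 255]
    result = from_vertical(bit_serial_add(to_vertical(xs, 8), to_vertical(ys, 8)), LANES)
    print(result)
```

In this model the loop runs once per bit of operand width regardless of how many lanes there are, which is the sense in which the computation is bit-serial yet SIMD across elements.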

Updated: 2021-05-28