A 7.3 M Output Non-Zeros/J, 11.7 M Output Non-Zeros/GB Reconfigurable Sparse Matrix-Matrix Multiplication Accelerator, IEEE Journal of Solid-State Circuits


A 7.3 M Output Non-Zeros/J, 11.7 M Output Non-Zeros/GB Reconfigurable Sparse Matrix-Matrix Multiplication Accelerator
IEEE Journal of Solid-State Circuits (IF 5.4), Pub Date: 2020-04-01, DOI: 10.1109/jssc.2019.2960480
Dong-Hyeon Park, Subhankar Pal, Siying Feng, Paul Gao, Jielun Tan, Austin Rovinski, Shaolin Xie, Chun Zhao, Aporva Amarnath, Timothy Wesley, Jonathan Beaumont, Kuan-Yu Chen, Chaitali Chakrabarti, Michael Bedford Taylor, Trevor Mudge, David Blaauw, Hun-Seok Kim, Ronald G. Dreslinski

A sparse matrix–matrix multiplication (SpMM) accelerator with 48 heterogeneous cores and a reconfigurable memory hierarchy is fabricated in 40-nm CMOS. The compute fabric consists of dedicated floating-point multiplication units and general-purpose Arm Cortex-M0 and Cortex-M4 cores. The on-chip memory reconfigures as scratchpad or cache, depending on the phase of the algorithm. The memory and compute units are interconnected with synthesizable coalescing crossbars for efficient memory access. The 2.0-mm $\times$ 2.6-mm chip exhibits 12.6$\times$ (8.4$\times$) energy efficiency gain, 11.7$\times$ (77.6$\times$) off-chip bandwidth efficiency gain, and 17.1$\times$ (36.9$\times$) compute density gain against a high-end CPU (GPU) across a diverse set of synthetic and real-world power-law graph-based sparse matrices.
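For readers unfamiliar with the workload, the sketch below shows what an SpMM computes and how the "output non-zeros" in the title metrics can be counted. It is a generic row-wise software formulation chosen for brevity, not the dataflow of the fabricated chip, which the abstract does not describe. The dictionary-of-rows storage and the names `spmm`, `count_nonzeros`, and `nonzeros_per_joule` are assumptions made for illustration only.

```python
from collections import defaultdict


def spmm(a_rows, b_rows):
    """Multiply two sparse matrices stored as {row: {col: value}} dicts.

    Hypothetical helper: a plain row-wise product, not the accelerator's method.
    """
    c_rows = {}
    for i, a_row in a_rows.items():
        acc = defaultdict(float)
        for k, a_ik in a_row.items():
            # Only rows of B that match a non-zero column of A contribute.
            for j, b_kj in b_rows.get(k, {}).items():
                acc[j] += a_ik * b_kj
        # Drop entries whose partial products cancelled to exactly zero.
        row = {j: v for j, v in acc.items() if v != 0.0}
        if row:
            c_rows[i] = row
    return c_rows


def count_nonzeros(rows):
    """Number of stored non-zero entries in a {row: {col: value}} matrix."""
    return sum(len(row) for row in rows.values())


def nonzeros_per_joule(output_nonzeros, energy_joules):
    """Output non-zeros produced per joule, the form of the title's first metric."""
    return output_nonzeros / energy_joules


# Small made-up example: A is 2x3, B is 3x2.
A = {0: {0: 1.0, 2: 2.0}, 1: {1: 3.0}}
B = {0: {1: 4.0}, 1: {0: 5.0}, 2: {1: 6.0}}
C = spmm(A, B)
nnz = count_nonzeros(C)
```

Dividing `nnz` by a measured energy (or by the bytes moved off-chip) gives figures in the same units as the output non-zeros/J and output non-zeros/GB quoted in the title; any energy value passed to `nonzeros_per_joule` here would be the caller's own measurement, not a number from the paper.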
