A 7-nm Compute-in-Memory SRAM Macro Supporting Multi-Bit Input, Weight and Output and Achieving 351 TOPS/W and 372.4 GOPS
IEEE Journal of Solid-State Circuits (IF 5.4), Pub Date: 2021-01-01, DOI: 10.1109/jssc.2020.3031290
Mahmut E. Sinangil, Burak Erbagci, Rawan Naous, Kerem Akarvardar, Dar Sun, Win-San Khwa, Hung-Jen Liao, Yih Wang, Jonathan Chang

In this work, we present a compute-in-memory (CIM) macro built around a standard two-port compiler macro using a foundry 8T bit-cell in 7-nm FinFET technology. The proposed design supports 1024 4-b $\times$ 4-b multiply-and-accumulate (MAC) computations simultaneously. The 4-bit input is represented by the number of read word-line (RWL) pulses, while the 4-bit weight is realized by charge sharing among binary-weighted computation capacitors. Each unit computation capacitor is formed by the inherent capacitance of the sense amplifier (SA) inside the 4-bit Flash ADC, which saves area and minimizes the kick-back effect. Access time is 5.5 ns with a 0.8-V power supply at room temperature. The proposed design achieves an energy efficiency of 351 TOPS/W and a throughput of 372.4 GOPS. Implications of our design from neural network implementation and accuracy perspectives are also discussed.
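
As a rough illustration of the scheme the abstract describes, the following Python sketch models one idealized 4-b $\times$ 4-b product: the input is applied as a count of RWL pulses, each weight-bit column discharges only when its stored bit is 1, and charge sharing across binary-weighted capacitors forms a capacitor-weighted average. The function names, the linear per-pulse discharge, and the 8:4:2:1 capacitor ratio are assumptions made for illustration, not details taken from the paper. The last function checks the throughput figure under the further assumption that each MAC counts as two operations (one multiply and one add).

```python
# Idealized behavioral sketch. The names, the linear-discharge model and the
# 8:4:2:1 capacitor ratio are illustrative assumptions, not the authors' circuit.

CAP_WEIGHTS = (8, 4, 2, 1)  # assumed binary-weighted unit caps, MSB first


def bitline_drops(x, w):
    """Per-column bit-line drop, in units of the drop caused by one RWL pulse.

    The 4-bit input x is applied as x pulses; a column discharges on each
    pulse only if its stored weight bit is 1 (assumed linear discharge).
    """
    if not (0 <= x <= 15 and 0 <= w <= 15):
        raise ValueError("x and w must be 4-bit unsigned values")
    weight_bits = [(w >> k) & 1 for k in (3, 2, 1, 0)]  # MSB first
    return [x * bit for bit in weight_bits]


def charge_share(drops):
    """Ideal charge sharing: capacitor-weighted average of the column drops."""
    total_cap = sum(CAP_WEIGHTS)  # 15 unit caps
    return sum(c * d for c, d in zip(CAP_WEIGHTS, drops)) / total_cap


def mac_4b(x, w):
    """Shared-node drop for one product; equals x * w / 15 in pulse units."""
    return charge_share(bitline_drops(x, w))


def throughput_gops(num_macs=1024, access_time_ns=5.5, ops_per_mac=2):
    """Operations per nanosecond, which is numerically the same as GOPS.

    ops_per_mac=2 is an assumption (one multiply plus one accumulate).
    """
    return num_macs * ops_per_mac / access_time_ns
```

Under these assumptions, `mac_4b(x, w)` is proportional to the product `x * w`, and `throughput_gops()` gives 2048 / 5.5, which rounds to the 372.4 GOPS quoted in the abstract. The model omits accumulation across rows, ADC quantization, and all non-idealities.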

Updated: 2021-01-01