A 4-Kb 1-to-8-bit Configurable 6T SRAM-Based Computation-in-Memory Unit-Macro for CNN-Based AI Edge Processors
IEEE Journal of Solid-State Circuits (IF 4.6), Pub Date: 2020-10-01, DOI: 10.1109/jssc.2020.3005754
Yen-Cheng Chiu, Zhixiao Zhang, Jia-Jing Chen, Xin Si, Ruhui Liu, Yung-Ning Tu, Jian-Wei Su, Wei-Hsing Huang, Jing-Hong Wang, Wei-Chen Wei, Je-Min Hung, Shyh-Shyuan Sheu, Sih-Han Li, Chih-I Wu, Ren-Shuo Liu, Chih-Cheng Hsieh, Kea-Tiong Tang, Meng-Fan Chang

Previous SRAM-based computing-in-memory (SRAM-CIM) macros suffer from small read margins for high-precision operations, large cell-array area overhead, and limited compatibility with many input and weight configurations. This work presents a 1-to-8-bit configurable SRAM-CIM unit-macro using: 1) a hybrid structure combining 6T-SRAM-based in-memory binary product-sum (PS) operations with digital near-memory-computing multibit PS accumulation to increase read accuracy and reduce area overhead; 2) column-based place-value-grouped weight mapping and a serial-bit input (SBIN) mapping scheme to facilitate reconfiguration and increase array efficiency under various input and weight configurations; 3) a self-reference multilevel reader (SRMLR) to reduce read-out energy and achieve a sensing margin $2\times$ that of the mid-point reference scheme; and 4) an input-aware bitline voltage compensation scheme to ensure successful read operations across various input-weight patterns. A 4-Kb configurable 6T-SRAM CIM unit-macro was fabricated using a 55-nm CMOS process with foundry 6T-SRAM cells. The resulting macro achieved an access time of 3.5 ns per cycle (pipelined) and an energy efficiency of 0.6–40.2 TOPS/W under binary to 8-b input/8-b weight precision.
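The hybrid structure in item 1 and the serial-bit input scheme in item 2 rest on a simple arithmetic identity: a multibit product-sum can be split into 1-bit by 1-bit product-sums that are then combined by shift-and-add. The Python sketch below illustrates only that identity. It assumes unsigned inputs and weights, and the function names (`bit_planes`, `binary_product_sum`, `multibit_product_sum`) are hypothetical; the abstract does not describe how the macro handles signed values, how many rows are activated per cycle, or how the read-out is quantized, so none of that is modeled here.

```python
def bit_planes(values, n_bits):
    """Split unsigned integers into n_bits binary planes, LSB first."""
    return [[(v >> b) & 1 for v in values] for b in range(n_bits)]


def binary_product_sum(input_bits, weight_bits):
    """1-bit x 1-bit product-sum (the step assumed to run inside the array)."""
    return sum(i & w for i, w in zip(input_bits, weight_bits))


def multibit_product_sum(inputs, weights, in_bits, w_bits):
    """Combine binary partial sums by digital shift-and-add."""
    in_planes = bit_planes(inputs, in_bits)
    w_planes = bit_planes(weights, w_bits)
    total = 0
    for i, in_plane in enumerate(in_planes):    # one input bit position at a time
        for j, w_plane in enumerate(w_planes):  # one weight place value at a time
            partial = binary_product_sum(in_plane, w_plane)
            total += partial << (i + j)         # scale by place value 2**(i + j)
    return total


inputs = [3, 0, 7, 12]
weights = [5, 9, 2, 1]
result = multibit_product_sum(inputs, weights, in_bits=4, w_bits=4)
# The decomposed result matches the direct dot product.
assert result == sum(x * w for x, w in zip(inputs, weights))
```

Changing `in_bits` and `w_bits` from 1 to 8 changes only the number of binary passes, which is one way to read the 1-to-8-bit configurability described above.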

Updated: 2020-10-01