Monolithically Integrated RRAM- and CMOS-Based In-Memory Computing Optimizations for Efficient Deep Learning
IEEE Micro (IF 3.6), Pub Date: 2019-11-01, DOI: 10.1109/mm.2019.2943047
Shihui Yin 1, Yulhwa Kim 2, Xu Han 1, Hugh Barnaby 1, Shimeng Yu 3, Yandong Luo 3, Wangxin He 1, Xiaoyu Sun 3, Jae-Joon Kim 2, Jae-sun Seo 1

Resistive RAM (RRAM) has been presented as a promising memory technology for deep neural network (DNN) hardware design, offering nonvolatility, high density, a high ON/OFF ratio, and compatibility with logic processes. However, prior RRAM works for DNNs have shown limitations in parallelism for in-memory computing, array efficiency with large peripheral circuits, multilevel analog operation, and demonstration of monolithic integration. In this article, we propose circuit-/device-level optimizations to improve the energy efficiency and density of RRAM-based in-memory computing architectures. We report experimental results based on a prototype chip design of 128 × 64 RRAM arrays and CMOS peripheral circuits, where RRAM devices are monolithically integrated in a commercial 90-nm CMOS technology. We demonstrate CMOS peripheral circuit optimization using an input-splitting scheme and investigate the implications of a higher low-resistance state for energy efficiency and robustness. Employing the proposed techniques, we demonstrate RRAM-based in-memory computing with up to 116.0 TOPS/W energy efficiency and 84.2% CIFAR-10 accuracy. Furthermore, we investigate four-level programming with a single RRAM device, and report system-level performance and DNN accuracy results using the circuit-level benchmark simulator NeuroSim.

Updated: 2019-11-01