High-Throughput In-Memory Computing for Binary Deep Neural Networks With Monolithically Integrated RRAM and 90-nm CMOS
IEEE Transactions on Electron Devices (IF 2.9), Pub Date: 2020-10-01, DOI: 10.1109/ted.2020.3015178
Shihui Yin, Xiaoyu Sun, Shimeng Yu, Jae-Sun Seo

Deep neural network (DNN) hardware designs have been bottlenecked by conventional memories, such as SRAM, due to density, leakage, and parallel computing challenges. Resistive devices can address the density and volatility issues but have been limited by peripheral circuit integration. In this work, we present a resistive RAM (RRAM)-based in-memory computing (IMC) design, which is fabricated in 90-nm CMOS with monolithic integration of RRAM devices. We integrated a $128 \times 64$ RRAM array with CMOS peripheral circuits, including row/column decoders and flash analog-to-digital converters (ADCs), which collectively become a core component for scalable RRAM-based IMC for large DNNs. To maximize IMC parallelism, we assert all 128 wordlines of the RRAM array simultaneously, perform analog computing along the bitlines, and digitize the bitline voltages using ADCs. The resistance distribution of low-resistance states is tightened by an iterative write-verify scheme. Prototype chip measurements demonstrate high binary DNN accuracy of 98.5% for MNIST and 83.5% for CIFAR-10 data sets, with 24 TOPS/W and 158 GOPS. This represents $22.3\times$ and $10.1\times$ improvements in throughput and energy–delay product (EDP), respectively, compared with the state-of-the-art literature, which can enable intelligent functionalities for area-/energy-constrained edge computing devices.
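
The fully parallel read described in the abstract (all 128 wordlines asserted, accumulation along each bitline, then digitization by a flash ADC) can be illustrated with a small idealized model. This is only a sketch under stated assumptions, not the authors' circuit: the 0/1 encoding of inputs and weights, the 8-level ADC resolution, the evenly spaced comparator thresholds, and the names `bitline_sums`, `flash_adc`, and `imc_read` are all hypothetical, and device non-idealities such as resistance variation are ignored.

```python
# Idealized model of one in-memory read of a 128 x 64 binary array.
# Assumptions (not from the abstract): inputs and weights are 0/1,
# the ADC has 8 levels, and comparator thresholds are evenly spaced.

ROWS, COLS = 128, 64  # array dimensions stated in the abstract


def bitline_sums(inputs, weights):
    """Column-wise sums obtained when every wordline is driven at once.

    inputs  : list of ROWS values in {0, 1} (wordline activations)
    weights : ROWS x COLS nested list of values in {0, 1} (cell states)
    """
    return [
        sum(inputs[r] * weights[r][c] for r in range(ROWS))
        for c in range(COLS)
    ]


def flash_adc(value, levels=8, full_scale=ROWS):
    """Flash-style quantizer: count how many comparator thresholds are met.

    A flash ADC compares its input against all thresholds in parallel;
    the number of comparators that fire is the output code (0..levels-1).
    """
    step = (full_scale + 1) / levels
    thresholds = [k * step for k in range(1, levels)]
    return sum(1 for t in thresholds if value >= t)


def imc_read(inputs, weights, levels=8):
    """One parallel read: accumulate along each bitline, then digitize."""
    return [flash_adc(s, levels) for s in bitline_sums(inputs, weights)]


if __name__ == "__main__":
    # Column c holds c ones, so its ideal bitline sum equals c.
    weights = [[1 if r < c else 0 for c in range(COLS)] for r in range(ROWS)]
    inputs = [1] * ROWS
    print(imc_read(inputs, weights))
```

In this model a single call to `imc_read` stands in for one array access that produces all 64 column outputs at once, which is the source of the parallelism the abstract emphasizes; the coarse ADC codes show why a binary network, which only needs a thresholded result per column, tolerates low-resolution digitization.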

Updated: 2020-10-01