Fusion of memristor and digital compute-in-memory processing for energy-efficient edge computing
Science (IF 56.9), Pub Date: 2024-04-18, DOI: 10.1126/science.adf5538
Tai-Hao Wen, Je-Min Hung, Wei-Hsing Huang, Chuan-Jia Jhang, Yun-Chen Lo, Hung-Hsi Hsu, Zhao-En Ke, Yu-Chiao Chen, Yu-Hsiang Chin, Chin-I Su, Win-San Khwa, Chung-Chuan Lo, Ren-Shuo Liu, Chih-Cheng Hsieh, Kea-Tiong Tang, Mon-Shu Ho, Chung-Cheng Chou, Yu-Der Chih, Tsung-Yung Jonathan Chang, Meng-Fan Chang

Artificial intelligence (AI) edge devices prefer employing high-capacity nonvolatile compute-in-memory (CIM) to achieve high energy efficiency and rapid wakeup-to-response with sufficient accuracy. Most previous works are based on either memristor-based CIMs, which suffer from accuracy loss and do not support training as a result of limited endurance, or digital static random-access memory (SRAM)–based CIMs, which suffer from large area requirements and volatile storage. We report an AI edge processor that uses a memristor-SRAM CIM-fusion scheme to simultaneously exploit the high accuracy of the digital SRAM CIM and the high energy-efficiency and storage density of the resistive random-access memory memristor CIM. This also enables adaptive local training to accommodate personalized characterization and user environment. The fusion processor achieved high CIM capacity, short wakeup-to-response latency (392 microseconds), high peak energy efficiency (77.64 teraoperations per second per watt), and robust accuracy (<0.5% accuracy loss). This work demonstrates that memristor technology has moved beyond in-lab development stages and now has manufacturability for AI edge processors.

Updated: 2024-04-18