Deep In-Memory Architectures in SRAM: An Analog Approach to Approximate Computing, Proceedings of the IEEE



Deep In-Memory Architectures in SRAM: An Analog Approach to Approximate Computing
Proceedings of the IEEE (IF 20.6), Pub Date: 2020-12-01, DOI: 10.1109/jproc.2020.3034117
Mingu Kang, Sujan K. Gonugondla, Naresh R. Shanbhag

This article provides an overview of recently proposed deep in-memory architectures (DIMAs) in SRAM for energy- and latency-efficient hardware realization of machine learning (ML) algorithms. DIMA tackles the data movement problem in von Neumann architectures head-on by deeply embedding mixed-signal computations into a conventional memory array. In doing so, it trades off its computational signal-to-noise ratio (compute SNR) against energy and latency and therefore represents an analog form of approximate computing. DIMA exploits the inherent error immunity of ML algorithms and SNR budgeting methods to operate its analog circuitry in a low-swing/low-compute SNR regime, thereby achieving a $>100\times$ reduction in the energy-delay product (EDP) over an equivalent von Neumann architecture with no loss in inference accuracy. This article describes DIMA’s computational pipeline and provides a Shannon-inspired rationale for its robustness to process, temperature, and voltage variations, along with design guidelines to manage its analog nonidealities. DIMA’s versatility, effectiveness, and practicality, demonstrated via multiple silicon IC prototypes in a 65-nm CMOS process, are described. A DIMA-based instruction set architecture (ISA) to realize an end-to-end application-to-architecture mapping for accelerating diverse ML algorithms is also presented. Finally, DIMA’s fundamental tradeoff between energy and accuracy in the low-compute SNR regime is analyzed to determine energy-optimum design parameters.
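The error-immunity argument in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' model or their SNR budgeting method. It assumes the analog nonidealities can be lumped into additive Gaussian noise on a dot product, and it measures how often a binary decision made on the noisy result agrees with the ideal one. The function names (`noisy_dot`, `decision_agreement`) and all parameter values are hypothetical choices made for this illustration.

```python
import numpy as np

def noisy_dot(w, x, snr_db, rng):
    """Ideal dot products x @ w plus Gaussian noise scaled to a target SNR.

    Assumption: all analog error is modeled as additive white Gaussian
    noise whose power is set relative to the empirical signal power.
    """
    y = x @ w                                   # ideal results, shape (n,)
    signal_power = np.mean(y ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    return y + rng.normal(0.0, np.sqrt(noise_power), size=y.shape)

def decision_agreement(snr_db, n=20000, dim=64, seed=0):
    """Fraction of sign decisions unchanged by noise at the given SNR."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=dim)                    # hypothetical weight vector
    x = rng.normal(size=(n, dim))               # hypothetical input samples
    ideal = (x @ w) > 0
    noisy = noisy_dot(w, x, snr_db, rng) > 0
    return float(np.mean(ideal == noisy))

if __name__ == "__main__":
    for snr_db in (0, 10, 20, 30):
        print(f"compute SNR {snr_db:2d} dB: agreement {decision_agreement(snr_db):.3f}")
```

In this toy setting the decision agreement rises quickly with compute SNR, which is the qualitative behavior that makes a low-swing, low-SNR operating point plausible for inference workloads. The energy and latency side of the tradeoff is not modeled here.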


Updated: 2020-12-01
