Accelerating Inference of Convolutional Neural Networks Using In-memory Computing
Frontiers in Computational Neuroscience (IF 3.2) Pub Date: 2023-01-09, DOI: 10.3389/fncom.2021.674154
Martino Dazzi, Abu Sebastian, Luca Benini, Evangelos Eleftheriou

In-memory computing (IMC) is a non-von Neumann paradigm that has recently established itself as a promising approach to energy-efficient, high-throughput hardware for deep learning applications. One prominent application of IMC is performing matrix-vector multiplication in O(1) time complexity by mapping the synaptic weights of a neural-network layer onto the devices of an IMC core. However, because its pattern of execution differs significantly from that of previous computational paradigms, IMC requires a rethinking of the architectural design choices made when designing deep-learning hardware. In this work, we focus on application-specific IMC hardware for inference of Convolutional Neural Networks (CNNs), and provide methodologies for implementing the various architectural components of the IMC core. Specifically, we present methods for mapping synaptic weights and activations onto the memory structures and give evidence of the various trade-offs therein, such as the one between on-chip memory requirements and execution latency. Lastly, we show how to employ these methods to implement a pipelined dataflow that achieves throughput and latency beyond the state of the art for image classification tasks.
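The O(1) matrix-vector multiplication is the central primitive here: once a layer's weights are programmed as device conductances, a single analog read of the array yields the full product. Below is a minimal NumPy sketch of that mapping, assuming an idealized, noise-free crossbar; the CrossbarIMC class and the layer dimensions are illustrative only, not the paper's implementation.

```python
import numpy as np

# Minimal sketch (illustrative, not the paper's design): an idealized
# IMC crossbar that stores a layer's synaptic weights as device
# conductances, so a single analog "read" computes W @ x.

class CrossbarIMC:
    def __init__(self, weights: np.ndarray):
        # Mapping step: each synaptic weight becomes one device conductance.
        self.conductances = weights

    def mvm(self, x: np.ndarray) -> np.ndarray:
        # In hardware this is Ohm's law plus Kirchhoff's current law,
        # executed in constant time; here we emulate the result digitally.
        return self.conductances @ x

# Map a convolutional layer (C_out, C_in, K, K) onto one crossbar by
# unrolling each filter into a row (the standard im2col-style mapping).
c_out, c_in, k = 8, 3, 3
filters = np.random.randn(c_out, c_in, k, k)
core = CrossbarIMC(filters.reshape(c_out, -1))

# One input patch (a column of the im2col matrix) -> one MVM,
# producing the activations of all C_out channels for one output pixel.
patch = np.random.randn(c_in * k * k)
out_pixel = core.mvm(patch)
print(out_pixel.shape)  # (8,)
```

In a physical IMC core the dot products are computed by Ohm's and Kirchhoff's laws across the array, which is why the operation does not scale in time with the matrix dimensions as it would on a digital processor.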

Updated: 2023-01-09