Benchmark of the Compute-in-Memory-Based DNN Accelerator With Area Constraint
IEEE Transactions on Very Large Scale Integration (VLSI) Systems (IF 2.8), Pub Date: 2020-09-01, DOI: 10.1109/tvlsi.2020.3001526
Anni Lu, Xiaochen Peng, Yandong Luo, Shimeng Yu

Compute-in-memory (CIM) is a promising computing paradigm for accelerating the inference of deep neural network (DNN) algorithms due to its high processing parallelism and energy efficiency. Prior CIM-based DNN accelerators mostly consider full custom designs, which assume that all the weights are stored on-chip. For lightweight smart edge devices, this assumption may not hold. In this article, CIM-based DNN accelerators are designed and benchmarked under different chip area constraints. First, a scheduling strategy and dataflow for DNN inference are investigated for the case where only part of the weights can be stored on-chip. Two weight reload schemes are evaluated: 1) reloading partial weights while reusing the input/output feature maps and 2) loading a batch of inputs and reusing the partial weights on-chip across the batch. Then, a system-level performance benchmark is performed for the inference of ResNet-18 on the ImageNet data set. The design tradeoffs under different area constraints, dataflows, and device technologies [static random access memory (SRAM) versus ferroelectric field-effect transistor (FeFET)] are discussed.
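The tradeoff between the two reload schemes can be made concrete with a toy sketch. The following is a minimal illustration, not the paper's benchmark framework: it models a single fully connected layer whose weight matrix is split into chunks, assumes only one chunk fits "on-chip" at a time, and counts weight-load traffic under each scheme. All names, chunk sizes, and the traffic model are illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the paper's simulator) contrasting the
# two weight-reload schemes from the abstract on a toy fully connected layer.
import numpy as np

def scheme1_reload_weights(x_batch, w_chunks):
    """Scheme 1: keep each input's feature map on-chip and stream weight
    chunks. Every input reloads every chunk, so weight traffic scales with
    the batch size."""
    outputs, weight_loads = [], 0
    for x in x_batch:                     # one input resident at a time
        parts = []
        for w in w_chunks:                # reload each weight chunk per input
            weight_loads += w.size
            parts.append(x @ w)
        outputs.append(np.concatenate(parts))
    return np.stack(outputs), weight_loads

def scheme2_reuse_weights(x_batch, w_chunks):
    """Scheme 2: pin one weight chunk on-chip and stream the whole input
    batch through it before moving on; each chunk is loaded once per batch."""
    outputs = [[] for _ in x_batch]
    weight_loads = 0
    for w in w_chunks:                    # one weight chunk resident at a time
        weight_loads += w.size
        for i, x in enumerate(x_batch):   # reuse the chunk across the batch
            outputs[i].append(x @ w)
    return np.stack([np.concatenate(p) for p in outputs]), weight_loads

rng = np.random.default_rng(0)
x_batch = rng.standard_normal((8, 64))    # batch of 8 input vectors
w = rng.standard_normal((64, 256))
w_chunks = np.split(w, 4, axis=1)         # 4 chunks; assume 1 fits on-chip

y1, loads1 = scheme1_reload_weights(x_batch, w_chunks)
y2, loads2 = scheme2_reuse_weights(x_batch, w_chunks)
assert np.allclose(y1, y2)                # same result, different traffic
print(f"scheme 1 weight loads: {loads1}") # batch-size times more traffic
print(f"scheme 2 weight loads: {loads2}")
```

In this toy model, scheme 2 cuts weight traffic by a factor equal to the batch size, at the cost of buffering partial output feature maps for the whole batch; which scheme wins in practice depends on the relative cost of moving weights versus activations, which is what the paper's area-constrained benchmark quantifies.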

Updated: 2020-09-01