A 43.1TOPS/W Energy-Efficient Absolute-Difference-Accumulation Operation Computing-In-Memory With Computation Reuse
IEEE Transactions on Circuits and Systems II: Express Briefs (IF 4.0), Pub Date: 2021-03-19, DOI: 10.1109/tcsii.2021.3067327
Soyeon Um, Sangyeob Kim, Sangjin Kim, Hoi-Jun Yoo

Recently, Computing-In-Memory (CIM) processors have been proposed to achieve high energy efficiency by reducing data movement and relieving memory bottlenecks. Furthermore, a network that achieves highly accurate image classification has been introduced by using the Absolute-Difference-Accumulation (ADA) operation in place of the multiplication-and-accumulation operation that is widely used in DNNs. The ADA operation offers not only an opportunity for energy-efficient DNN acceleration by reducing multiplications but also a chance to reuse computation results. However, previous CIM processors cannot reuse earlier computation results for other computations. In this brief, we propose a highly accurate and energy-efficient ADA-CIM processor with two key features: 1) computation reuse for low power, resulting in a 49.5% reduction in CIM operation power, and 2) a low-cost sign prediction core with 3-bit activation and weight quantization for high utilization. With these two key features, the proposed ADA-CIM processor is simulated in 28 nm CMOS technology and occupies 3.78 mm². It consumes 2.77 mW and achieves an energy efficiency of 43.1 TOPS/W with a high accuracy of 91.62% on CIFAR-10 (ResNet-20).
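
As a rough illustration of the operation the abstract contrasts with multiply-and-accumulate, the sketch below assumes that ADA means the sum of element-wise absolute differences between activations and weights, as its name suggests. The function names (`mac`, `ada`, `implied_throughput_gops`) are hypothetical and are not taken from the paper, and the sketch does not attempt to model the processor's computation-reuse scheme or its sign prediction core. The last function only multiplies the two reported figures to obtain an implied throughput, under the assumption that 43.1 TOPS/W and 2.77 mW describe the same operating point.

```python
def mac(activations, weights):
    """Multiply-and-accumulate: one multiplication per element pair."""
    return sum(a * w for a, w in zip(activations, weights))


def ada(activations, weights):
    """Absolute-difference-accumulation (assumed definition):
    one subtraction and one absolute value per element pair, no multiplication."""
    return sum(abs(a - w) for a, w in zip(activations, weights))


def implied_throughput_gops(tops_per_watt, power_mw):
    """TOPS/W times mW gives 1e12 * 1e-3 = 1e9 operations per second, i.e. GOPS.
    Assumes both figures refer to the same operating point."""
    return tops_per_watt * power_mw


if __name__ == "__main__":
    acts = [1, 5, 3]      # made-up example values, not data from the paper
    wts = [4, 2, 3]
    print("MAC:", mac(acts, wts))
    print("ADA:", ada(acts, wts))
    print("Implied throughput (GOPS):", implied_throughput_gops(43.1, 2.77))
```

Under these assumptions, the reported efficiency and power would correspond to roughly 119 GOPS of throughput; the abstract itself does not state a throughput figure.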

Updated: 2021-05-04