On the Accuracy of Analog Neural Network Inference Accelerators
arXiv - CS - Hardware Architecture. Pub Date: 2021-09-03. arXiv:2109.01262
T. Patrick Xiao, Ben Feinberg, Christopher H. Bennett, Venkatraman Prabhakar, Prashant Saxena, Vineet Agrawal, Sapan Agarwal, Matthew J. Marinella

Specialized accelerators have recently garnered attention as a method to reduce the power consumption of neural network inference. A promising category of accelerators utilizes nonvolatile memory arrays to both store weights and perform $\textit{in situ}$ analog computation inside the array. While prior work has explored the design space of analog accelerators to optimize performance and energy efficiency, there is seldom a rigorous evaluation of the accuracy of these accelerators. This work shows how architectural design decisions, particularly in mapping neural network parameters to analog memory cells, influence inference accuracy. When evaluated using ResNet50 on ImageNet, the system's resilience to analog non-idealities (cell programming errors, analog-to-digital converter resolution, and array parasitic resistances) improves when analog quantities in the hardware are made proportional to the weights in the network. Moreover, contrary to the assumptions of prior work, nearly equivalent resilience to cell imprecision can be achieved by storing weights entirely as analog quantities, rather than spreading weight bits across multiple devices, a scheme often referred to as bit slicing. By exploiting proportionality, analog system designers have the freedom to match the precision of the hardware to the needs of the algorithm, rather than attempting to guarantee the same level of precision in the intermediate results as an equivalent digital accelerator. This ultimately results in an analog accelerator that is more accurate, more robust to analog errors, and more energy-efficient.
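The two weight-mapping schemes the abstract contrasts can be illustrated with a small simulation. The sketch below is not the paper's model; it is a minimal, hypothetical NumPy experiment in which each memory cell suffers additive Gaussian programming error, and a matrix-vector product is computed either with one conductance per weight (proportional mapping) or with each weight's magnitude split into several lower-precision cells (bit slicing). All function names, the noise model, and the slice parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)


def analog_matvec_proportional(W, x, sigma, rng):
    """Store each weight as a single conductance proportional to its value.
    Programming error: additive Gaussian noise on every cell, with standard
    deviation sigma expressed as a fraction of the maximum conductance."""
    g_max = np.abs(W).max()
    G = W / g_max                           # normalized conductances in [-1, 1]
    G_noisy = G + rng.normal(0.0, sigma, G.shape)
    return (G_noisy @ x) * g_max            # rescale back to weight units


def analog_matvec_bitsliced(W, x, sigma, rng, n_slices=4, bits_per_slice=2):
    """Spread each weight's bits across several lower-precision cells
    (bit slicing); every slice cell sees the same programming noise."""
    g_max = np.abs(W).max()
    levels = 2 ** (n_slices * bits_per_slice)
    # Quantize normalized weights to signed integers.
    q = np.round(W / g_max * (levels // 2 - 1)).astype(np.int64)
    sign, mag = np.sign(q), np.abs(q)
    base = 2 ** bits_per_slice
    y = np.zeros(W.shape[0])
    for s in range(n_slices):
        digit = (mag // base**s) % base     # this slice's digit of each weight
        G = digit / (base - 1)              # normalized per-cell conductance
        G_noisy = G + rng.normal(0.0, sigma, G.shape)
        y += (base - 1) * base**s * ((sign * G_noisy) @ x)
    return y / (levels // 2 - 1) * g_max


W = rng.normal(0.0, 0.5, (64, 128))
x = rng.normal(0.0, 1.0, 128)
exact = W @ x

err_prop = np.abs(analog_matvec_proportional(W, x, 0.02, rng) - exact).mean()
err_slice = np.abs(analog_matvec_bitsliced(W, x, 0.02, rng) - exact).mean()
print(f"proportional mapping, mean |error|: {err_prop:.4f}")
print(f"bit-sliced mapping,   mean |error|: {err_slice:.4f}")
```

Note how bit slicing must re-scale each slice's noisy read by its place value (`base**s`), so noise on high-order slices is amplified, whereas in the proportional mapping the cell error stays proportional to the weight scale. This toy setup only hints at the trade-off; the paper's ResNet50/ImageNet evaluation covers programming error, ADC resolution, and parasitics together.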

Updated: 2021-09-06