Mixed-precision deep learning based on computational memory
arXiv - CS - Emerging Technologies. Pub Date: 2020-01-31, arXiv: 2001.11773
S. R. Nandakumar, Manuel Le Gallo, Christophe Piveteau, Vinay Joshi, Giovanni Mariani, Irem Boybat, Geethan Karunaratne, Riduan Khaddam-Aljameh, Urs Egger, Anastasios Petropoulos, Theodore Antonakopoulos, Bipin Rajendran, Abu Sebastian, Evangelos Eleftheriou

Deep neural networks (DNNs) have revolutionized the field of artificial intelligence and have achieved unprecedented success in cognitive tasks such as image and speech recognition. Training of large DNNs, however, is computationally intensive, and this has motivated the search for novel computing architectures targeting this application. A computational memory unit with nanoscale resistive memory devices organized in crossbar arrays could store the synaptic weights in their conductance states and perform the expensive weighted summations in place, in a non-von Neumann manner. However, updating the conductance states reliably during the weight-update process is a fundamental challenge that limits the training accuracy of such an implementation. Here, we propose a mixed-precision architecture that combines a computational memory unit, which performs the weighted summations and imprecise conductance updates, with a digital processing unit that accumulates the weight updates in high precision. A combined hardware/software training experiment of a multilayer perceptron based on the proposed architecture, using a phase-change memory (PCM) array, achieves 97.73% test accuracy on the task of classifying handwritten digits (based on the MNIST dataset), within 0.6% of the software baseline. The architecture is further evaluated using accurate behavioral models of PCM on a wide class of networks, namely convolutional neural networks, long short-term memory networks, and generative adversarial networks. Accuracies comparable to those of floating-point implementations are achieved without being constrained by the non-idealities associated with the PCM devices. A system-level study demonstrates a 173x improvement in the energy efficiency of the architecture when used for training a multilayer perceptron, compared with a dedicated fully digital 32-bit implementation.
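The core of the scheme is the division of labor between the two units: the crossbar performs the weighted summations and receives only coarse, imprecise conductance updates, while the digital unit accumulates the exact weight updates in high precision and transfers them to the devices only when they exceed the device update granularity. Below is a minimal NumPy sketch of this idea; the function names (analog_matvec, mixed_precision_update) and the numeric constants (EPSILON, WRITE_NOISE_STD, READ_NOISE_STD) are illustrative assumptions, and the multiplicative Gaussian noise is a simple stand-in for PCM behavior rather than the paper's device model.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative constants (assumptions, not values from the paper):
EPSILON = 0.01          # smallest reliably programmable conductance change
WRITE_NOISE_STD = 0.3   # relative stochasticity of one programming pulse
READ_NOISE_STD = 0.02   # relative conductance read noise
LEARNING_RATE = 0.1

def analog_matvec(G, x):
    # The crossbar computes y = G @ x in place; multiplicative read
    # noise stands in for PCM conductance fluctuations.
    noisy_G = G * (1.0 + READ_NOISE_STD * rng.standard_normal(G.shape))
    return noisy_G @ x

def mixed_precision_update(G, chi, grad):
    # Digital unit: accumulate the exact update in high precision.
    chi -= LEARNING_RATE * grad
    # Transfer to the devices only in multiples of the granularity EPSILON.
    pulses = np.trunc(chi / EPSILON)
    # Computational memory: apply coarse, noisy programming pulses.
    G += EPSILON * pulses * (1.0 + WRITE_NOISE_STD * rng.standard_normal(G.shape))
    # Keep the untransferred residual in the high-precision accumulator.
    chi -= EPSILON * pulses
    return G, chi

# Toy usage: one layer, one update step.
G = rng.normal(0.0, 0.1, size=(4, 8))    # synaptic weights as conductances
chi = np.zeros_like(G)                    # high-precision accumulator
x = rng.normal(size=8)
y = analog_matvec(G, x)                   # in-memory weighted summation
grad = rng.normal(size=G.shape)           # stand-in for a backpropagated gradient
G, chi = mixed_precision_update(G, chi, grad)

Because the residual chi is retained digitally, the granularity and stochasticity of the PCM programming pulses do not accumulate into a systematic training error, which is what allows the reported accuracies to approach the floating-point baseline.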

Updated: 2020-05-13