Training high-performance and large-scale deep neural networks with full 8-bit integers


Neural Networks (IF 6.0), Pub Date: 2020-01-15, DOI: 10.1016/j.neunet.2019.12.027
Yukuan Yang ₁ , Lei Deng ₂ , Shuang Wu ₁ , Tianyi Yan ₃ , Yuan Xie ₂ , Guoqi Li ₁


Deep neural network (DNN) quantization, which converts floating-point (FP) data in the network to integers (INT), is an effective way to shrink model size for memory savings and to simplify operations for compute acceleration. Recently, research on DNN quantization has moved from inference to training, laying a foundation for online training on accelerators. However, existing schemes that leave batch normalization (BN) untouched during training are mostly incomplete quantization schemes that still use high-precision FP in some parts of the data paths. Currently, no solution uses only low bit-width INT data throughout the whole training process of large-scale DNNs with acceptable accuracy. In this work, by decomposing all the computation steps in DNNs and fusing three special quantization functions to satisfy the different precision requirements, we propose a unified complete quantization framework termed "WAGEUBN" to quantize DNNs across all data paths, including W (Weights), A (Activation), G (Gradient), E (Error), U (Update), and BN. Moreover, the Momentum optimizer is also quantized to realize a completely quantized framework. Experiments on ResNet18/34/50 models demonstrate that WAGEUBN can achieve competitive accuracy on the ImageNet dataset. For the first time, the study of quantization in large-scale DNNs is advanced to the full 8-bit INT level. In this way, all the operations in training and inference can be bit-wise operations, pushing towards faster processing speed, decreased memory cost, and higher energy efficiency. Our thorough quantization framework has great potential for future efficient portable devices with online learning ability.
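As a rough illustration of what converting FP data to 8-bit INT involves, the sketch below maps floats onto a signed fixed-point grid and back. It is a generic example, not the quantization functions defined by WAGEUBN; the function names and the `frac_bits` parameter are assumptions introduced only for this illustration.

```python
def quantize_to_int8(values, frac_bits=7):
    """Map floats to signed 8-bit integer codes on a fixed-point grid.

    frac_bits is a hypothetical parameter: the grid step is 2**-frac_bits.
    """
    scale = 1 << frac_bits
    codes = []
    for v in values:
        code = round(v * scale)            # nearest grid point
        code = max(-128, min(127, code))   # saturate to the INT8 range
        codes.append(code)
    return codes


def dequantize(codes, frac_bits=7):
    """Recover the approximate float values represented by the codes."""
    scale = 1 << frac_bits
    return [c / scale for c in codes]


weights = [0.5, -0.25, 0.3, 1.5, -2.0]
codes = quantize_to_int8(weights)
approx = dequantize(codes)
print(codes)   # integer codes, each in [-128, 127]
print(approx)  # values snapped to the 2**-7 grid, out-of-range inputs saturated
```

Values inside the representable range are rounded to the nearest grid point, while values outside it saturate, which is one reason different data paths can call for different quantization treatment.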


Updated: 2020-01-15
