Quantization of Deep Neural Networks for Accurate Edge Computing
ACM Journal on Emerging Technologies in Computing Systems ( IF 2.1 ) Pub Date : 2021-06-30 , DOI: 10.1145/3451211
Wentao Chen 1, Hailong Qiu 1, Jian Zhuang 1, Chutong Zhang 2, Yu Hu 2, Qing Lu 3, Tianchen Wang 3, Yiyu Shi 3, Meiping Huang 1, Xiaowei Xu 1
Deep neural networks have demonstrated great potential in recent years, exceeding the performance of human experts in a wide range of applications. Because of their large size, however, compression techniques such as weight quantization and pruning are usually applied before they can be deployed on edge devices. It is generally believed that quantization degrades performance, and many existing works have explored quantization strategies that aim for minimum accuracy loss. In this paper, we argue that quantization, which essentially imposes regularization on weight representations, can sometimes improve accuracy. We conduct comprehensive experiments on three widely used applications: a fully connected network for biomedical image segmentation, a convolutional neural network for image classification on ImageNet, and a recurrent neural network for automatic speech recognition. Experimental results show that quantization improves accuracy by 1%, 1.95%, and 4.23% on the three applications, respectively, with a 3.5x-6.4x memory reduction.
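To make the weight-quantization idea in the abstract concrete, below is a minimal sketch of k-bit symmetric uniform quantization of a weight vector. This is a generic illustration of the technique, not the paper's actual scheme; the function name and parameters are assumptions for this example.

```python
def quantize_weights(weights, bits=8):
    """Map each float weight onto one of 2**(bits-1) - 1 symmetric levels
    per sign, then back to float ("fake" quantization for illustration)."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)
    levels = 2 ** (bits - 1) - 1      # e.g. 127 levels for 8-bit
    scale = max_abs / levels          # step size between adjacent levels
    # Round each weight to its nearest quantization level.
    return [round(w / scale) * scale for w in weights]

w = [0.31, -0.84, 0.05, 0.62]
wq = quantize_weights(w, bits=4)      # 4-bit: 7 levels, scale = 0.12
```

Storing the integer level indices plus one scale factor instead of 32-bit floats is what yields the memory reduction the abstract reports; the rounding itself acts like the regularization on weight representations that the authors argue can help accuracy.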

Updated: 2021-06-30