Efficient Bitwidth Search for Practical Mixed Precision Neural Network
arXiv - CS - Machine Learning Pub Date : 2020-03-17 , DOI: arxiv-2003.07577
Yuhang Li, Wei Wang, Haoli Bai, Ruihao Gong, Xin Dong, and Fengwei Yu

Network quantization has rapidly become one of the most widely used methods to compress and accelerate deep neural networks. Recent efforts propose quantizing the weights and activations of different layers at different precision to improve overall performance. However, it is challenging to find the optimal bitwidth (i.e., precision) for the weights and activations of each layer efficiently. Meanwhile, it is still unclear how to perform convolution over weights and activations of different precision efficiently on generic hardware platforms. To resolve these two issues, in this paper we first propose an Efficient Bitwidth Search (EBS) algorithm, which reuses the meta weights across different quantization bitwidths so that the strength of each candidate precision can be optimized directly w.r.t. the objective without superfluous copies, significantly reducing both memory and computational cost. Second, we propose a binary decomposition algorithm that converts weights and activations of different precision into binary matrices, making mixed precision convolution efficient and practical. Experimental results on the CIFAR10 and ImageNet datasets demonstrate that our mixed precision QNN outperforms handcrafted uniform-bitwidth counterparts and other mixed precision techniques.
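The binary decomposition idea can be illustrated with a minimal sketch: an unsigned k-bit integer tensor equals a weighted sum of its 0/1 bit-planes, so a product of m-bit weights with n-bit activations reduces to m×n binary partial products scaled by powers of two. The function names and the unsigned-quantization assumption below are illustrative, not taken from the paper.

```python
import numpy as np

def binary_decompose(q, bits):
    """Split an integer tensor q (values in [0, 2**bits - 1]) into a list
    of 0/1 bit-plane tensors, least significant bit first."""
    return [(q >> i) & 1 for i in range(bits)]

def mixed_precision_dot(qw, qa, w_bits, a_bits):
    """Dot product of w_bits-bit weights and a_bits-bit activations
    computed purely from binary (0/1) partial products.

    Because qw = sum_i 2^i * W_i and qa = sum_j 2^j * A_j, the result
    equals sum_{i,j} 2^(i+j) * (W_i . A_j), matching the direct integer
    dot product."""
    w_planes = binary_decompose(qw, w_bits)
    a_planes = binary_decompose(qa, a_bits)
    acc = 0
    for i, wb in enumerate(w_planes):
        for j, ab in enumerate(a_planes):
            # Each term is a binary-binary dot product, cheap on hardware.
            acc += (1 << (i + j)) * int(np.dot(wb, ab))
    return acc

rng = np.random.default_rng(0)
qw = rng.integers(0, 2**3, size=8)  # 3-bit weights
qa = rng.integers(0, 2**2, size=8)  # 2-bit activations
assert mixed_precision_dot(qw, qa, 3, 2) == int(np.dot(qw, qa))
```

The same identity applies channel-wise to convolution, which is what makes per-layer bitwidths practical: each layer pair contributes w_bits × a_bits binary convolutions regardless of which precision the search selects.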

Updated: 2020-03-18