ReLeQ: A Reinforcement Learning Approach for Automatic Deep Quantization of Neural Networks
IEEE Micro (IF 2.8), Pub Date: 2020-09-01, DOI: 10.1109/mm.2020.3009475
Ahmed T. Elthakeb, Prannoy Pilligundla, Fatemehsadat Mireshghallah, Hadi Esmaeilzadeh, Amir Yazdanbakhsh

Deep quantization (below eight bits) can significantly reduce DNN computation and storage by decreasing the bitwidth of network encodings. However, without arduous manual effort, this deep quantization can lead to significant accuracy loss, leaving it of questionable utility. We propose a systematic approach to tackle this problem by automating the discovery of bitwidths through an end-to-end deep reinforcement learning framework (ReLeQ). This framework exploits the sample efficiency of proximal policy optimization to explore the exponentially large space of possible bitwidth assignments to the layers. We show how ReLeQ can balance speed and quality, providing a heterogeneous bitwidth assignment for quantizing a large variety of deep networks with minimal accuracy loss (≤ 0.3%) while minimizing computation and storage costs. With these DNNs, ReLeQ enables conventional hardware and custom DNN accelerators to achieve roughly 2.2× speedup over 8-bit execution.
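The core idea is a heterogeneous, per-layer bitwidth assignment found by an RL agent rather than a single global bitwidth. The following minimal Python sketch only illustrates what such an assignment means for a network's weights; it is not the authors' ReLeQ implementation, and all names (quantize_uniform, layer_bitwidths) are hypothetical.

import numpy as np

def quantize_uniform(weights, bits):
    """Symmetric uniform quantization of a weight tensor to `bits` bits."""
    levels = 2 ** (bits - 1) - 1              # e.g. 3 bits -> grid points in [-3, 3]
    scale = np.max(np.abs(weights)) / levels
    if scale == 0:
        return weights
    return np.round(weights / scale) * scale  # dequantized values on the quantization grid

# Hypothetical per-layer bitwidths, as a ReLeQ-style agent might assign them:
# more aggressive quantization where the layer tolerates it.
layer_bitwidths = {"conv1": 8, "conv2": 4, "conv3": 3, "fc": 2}

rng = np.random.default_rng(0)
layers = {name: rng.standard_normal((64, 64)) for name in layer_bitwidths}

for name, w in layers.items():
    bits = layer_bitwidths[name]
    w_q = quantize_uniform(w, bits)
    print(f"{name}: {bits}-bit, mean abs quantization error {np.abs(w - w_q).mean():.4f}")

In ReLeQ, the space of such per-layer assignments is what the PPO-based agent searches, using accuracy and cost signals as feedback; the sketch above only shows the effect of one fixed assignment.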
