Recurrence of optimum for training weight and activation quantized networks
Applied and Computational Harmonic Analysis (IF 2.6), Pub Date: 2022-08-04, DOI: 10.1016/j.acha.2022.07.006
Ziang Long, Penghang Yin, Jack Xin

Deep neural networks (DNNs) are quantized for efficient inference on resource-constrained platforms. However, training deep learning models with low-precision weights and activations involves a demanding optimization task, which calls for minimizing a stage-wise loss function subject to a discrete set constraint. While numerous training methods have been proposed, existing studies of fully quantized DNNs are mostly empirical. From a theoretical point of view, we study practical techniques for overcoming the combinatorial nature of network quantization. Specifically, we investigate a simple yet powerful projected gradient-like algorithm for quantizing two-layer convolutional networks, which repeatedly takes one step on the float weights in the negative direction of a heuristic fake gradient of the loss function (the so-called coarse gradient) evaluated at the quantized weights. For the first time, we prove that under mild conditions, the sequence of quantized weights recurrently visits the global optimum of the discrete minimization problem for training a fully quantized network. We also present numerical evidence of this recurrence phenomenon in the weight evolution of quantized deep networks during training.
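
As a rough illustration of the update rule described in the abstract (take a gradient-like step on the auxiliary float weights using the coarse gradient evaluated at the quantized weights, then project onto the discrete set), the sketch below runs the iteration on a toy quadratic loss over binary weights. The quadratic objective, the binary set {-1, +1}^d, and all variable names are illustrative assumptions, not the paper's actual two-layer convolutional setup or its coarse-gradient construction.

```python
import numpy as np

# Minimal sketch: projected coarse-gradient iteration on a toy problem.
# Float weights take a gradient step using the gradient evaluated at the
# QUANTIZED weights; the quantized iterate is the sign projection.

rng = np.random.default_rng(0)
d = 8
w_star = rng.choice([-1.0, 1.0], size=d)   # discrete global minimizer (assumed known here)
A = np.eye(d)                              # simple convex quadratic for illustration

def loss(w):
    return 0.5 * (w - w_star) @ A @ (w - w_star)

def grad(w):
    return A @ (w - w_star)

def quantize(w):
    # projection onto the discrete set {-1, +1}^d; ties at 0 go to +1
    return np.sign(w) + (w == 0)

w_float = rng.normal(size=d)               # full-precision auxiliary weights
lr = 0.1

for t in range(200):
    w_q = quantize(w_float)                # quantized weights used in the "forward pass"
    coarse_g = grad(w_q)                   # heuristic coarse gradient, evaluated at w_q
    w_float = w_float - lr * coarse_g      # gradient-like step on the float weights
    if np.array_equal(w_q, w_star):
        print(f"iteration {t}: quantized iterate reached the global optimum, loss = {loss(w_q):.3f}")
        break
```

On this toy problem the quantized iterate reaches the binary minimizer after a few steps; the paper's result is the corresponding statement for fully quantized two-layer networks, namely that the quantized sequence visits the global optimum recurrently rather than converging to it once and for all.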

Updated: 2022-08-04