Design of a 2-Bit Neural Network Quantizer for Laplacian Source
Entropy (IF 2.1), Pub Date: 2021-07-22, DOI: 10.3390/e23080933
Zoran Perić 1 , Milan Savić 2 , Nikola Simić 3 , Bojan Denić 1 , Vladimir Despotović 4

Achieving real-time inference is one of the major challenges in contemporary neural network applications, as complex algorithms are frequently deployed to mobile devices with constrained storage and computing power. Moving from a full-precision neural network model to a lower-precision representation by applying quantization techniques is a popular approach to addressing this issue. Here, we analyze in detail and design a 2-bit uniform quantization model for a Laplacian source, chosen for its implementation simplicity, which in turn leads to shorter processing time and faster inference. The results show that the proposed model can achieve high classification accuracy (more than 96% in the case of MLP and more than 98% in the case of CNN), which is competitive with the performance of other quantization solutions with almost optimal precision.
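As a rough illustration of what a 2-bit uniform quantizer for a Laplacian source involves, the sketch below maps each input to one of four equally spaced levels on a symmetric support region and estimates the signal-to-quantization-noise ratio (SQNR) on simulated unit-variance Laplacian samples. It is not the authors' design procedure: the function names, the support limit `x_max`, the candidate grid, and the sample count are all assumptions made for illustration.

```python
import math
import random


def quantize_2bit_uniform(x, x_max):
    """Map x to one of 4 uniformly spaced levels on [-x_max, x_max].

    x_max (the support limit) is a free design parameter in this sketch.
    """
    step = x_max / 2.0  # support width 2 * x_max divided into 4 cells
    # Cell index 0..3; inputs outside the support are clipped to the outer cells.
    index = min(3, max(0, math.floor(x / step) + 2))
    # Representation level is the midpoint of the selected cell.
    return (index - 1.5) * step


def laplacian_samples(n, seed=0):
    """Draw n zero-mean, unit-variance Laplacian samples (scale b = 1/sqrt(2))."""
    rng = random.Random(seed)
    rate = math.sqrt(2.0)  # |x| is exponential with rate 1/b
    return [rng.choice((-1.0, 1.0)) * rng.expovariate(rate) for _ in range(n)]


def sqnr_db(samples, x_max):
    """Empirical signal-to-quantization-noise ratio in dB."""
    signal = sum(x * x for x in samples)
    noise = sum((x - quantize_2bit_uniform(x, x_max)) ** 2 for x in samples)
    return 10.0 * math.log10(signal / noise)


if __name__ == "__main__":
    data = laplacian_samples(20000)
    # Hypothetical grid of support limits; the best value is found empirically.
    for x_max in (1.0, 1.5, 2.0, 2.5, 3.0):
        print(f"x_max={x_max:.1f}  SQNR={sqnr_db(data, x_max):.2f} dB")
```

Sweeping `x_max` shows the trade-off the design has to balance: a narrow support increases clipping (overload) distortion, while a wide support increases the step size and therefore the granular distortion. To apply the same idea to network weights, one would typically normalize the weights to zero mean and unit variance first; that step is also an assumption here, not something the abstract specifies.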

Updated: 2021-07-22