Abstract
To reduce the random access memory (RAM) requirements and to increase the speed of recognition algorithms, we consider the problem of weight discretization in trained neural networks. We show that exponential discretization is preferable to linear discretization, since it achieves the same accuracy with 1–2 fewer bits. With 3-bit exponential discretization, the quality of the VGG-16 network is already satisfactory (top5 accuracy of 69%). The ResNet50 network reaches a top5 accuracy of 84% at 4 bits. Other networks perform fairly well at 5 bits (the top5 accuracies of Xception, Inception-v3, and MobileNet-v2 were 87%, 90%, and 77%, respectively). With fewer bits, the accuracy decreases rapidly.
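The contrast between linear and exponential discretization can be illustrated with a minimal sketch. The code below is not the authors' implementation; it assumes one common formulation: linear quantization places 2^b uniformly spaced levels over [−max|w|, max|w|], while exponential quantization spends one bit on the sign and maps each |w| to the nearest level of the form max|w| / 2^k.

```python
import numpy as np

def linear_quantize(w, bits):
    """Map weights to 2**bits uniformly spaced levels over [-max|w|, max|w|]."""
    wmax = np.max(np.abs(w))
    step = 2 * wmax / (2 ** bits - 1)
    return np.round(w / step) * step

def exponential_quantize(w, bits, base=2.0):
    """Map weights to sign(w) * wmax / base**k, k = 0 .. 2**(bits-1) - 1.

    One bit encodes the sign; the remaining bits index the exponent k,
    so the magnitude levels form a geometric (exponential) ladder.
    """
    wmax = np.max(np.abs(w))
    n_levels = 2 ** (bits - 1)                    # magnitudes representable
    levels = wmax / base ** np.arange(n_levels)   # wmax, wmax/2, wmax/4, ...
    # nearest exponential level for each |w|
    idx = np.argmin(np.abs(np.abs(w)[..., None] - levels), axis=-1)
    return np.sign(w) * levels[idx]

# Example: quantize a Gaussian weight matrix to 3 bits.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
w_exp = exponential_quantize(w, bits=3)
w_lin = linear_quantize(w, bits=3)
```

Because the exponential levels crowd near zero, where most trained weights lie, small weights are represented with much finer relative precision than under the uniform grid, which is one intuition for why 1–2 fewer bits suffice.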
Funding
The work was financially supported by State Program of SRISA RAS No. 0065-2019-0003 (AAA-A19-119011590090-2).
Ethics declarations
The authors declare that they have no conflicts of interest.
Cite this article
Malsagov, M.Y., Khayrov, E.M., Pushkareva, M.M. et al. Exponential Discretization of Weights of Neural Network Connections in Pre-Trained Neural Networks. Opt. Mem. Neural Networks 28, 262–270 (2019). https://doi.org/10.3103/S1060992X19040106