Abstract
We introduce a new method for speeding up the inference of deep neural networks. It is inspired by reduced-order modeling techniques for dynamical systems. The cornerstone of the proposed method is the maximum-volume (maxvol) algorithm. We demonstrate the efficiency of the method on neural networks pre-trained on several datasets and show that, in many practical cases, convolutional layers can be replaced by much smaller fully-connected layers at a relatively small cost in accuracy.
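To illustrate the building block named in the abstract, the following is a minimal sketch of the square maximum-volume (maxvol) row-selection algorithm: given a tall n × r matrix, it greedily picks r rows whose submatrix has quasi-maximal volume (|determinant|). The function name, tolerance, and the naive recompute-every-step variant are our own illustrative choices, not the paper's implementation, which would use rank-1 updates and a pivoted warm start for efficiency.

```python
import numpy as np

def maxvol(A, tol=1.05, max_iters=200):
    """Select r rows of a tall n x r matrix A whose r x r submatrix has
    quasi-maximal volume. Illustrative recomputation variant."""
    n, r = A.shape
    ind = np.arange(r)  # start from the first r rows (assumed nonsingular)
    for _ in range(max_iters):
        # Coefficients of every row of A in the basis of the selected rows;
        # B[ind] is the identity, so selected rows never trigger a swap.
        B = A @ np.linalg.inv(A[ind])
        i, j = np.unravel_index(np.argmax(np.abs(B)), B.shape)
        if abs(B[i, j]) <= tol:
            break  # no swap can grow the volume by more than the factor tol
        ind[j] = i  # swapping row i in multiplies |det| by |B[i, j]| > tol
    return np.sort(ind)
```

Since each accepted swap multiplies the submatrix volume by more than `tol`, the loop terminates; on exit, every entry of the coefficient matrix `B` is bounded by `tol` in absolute value, which is the usual quasi-maximality certificate.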
Notes
By this we mean convolutional neural networks consisting of convolutions, non-decreasing activation functions, batch normalization, max pooling, and residual connections.
In this paper, the given matrix is denoted by C.
FUNDING
This study was supported by RFBR, project nos. 19-31-90172 and 20-31-90127 (algorithm) and by the Ministry of Education and Science of the Russian Federation (grant 14.756.31.0001) (experiments).
Cite this article
Gusak, J., Daulbaev, T., Ponomarev, E. et al. Reduced-Order Modeling of Deep Neural Networks. Comput. Math. and Math. Phys. 61, 774–785 (2021). https://doi.org/10.1134/S0965542521050109