Nonlinear tensor train format for deep neural network compression
Neural Networks (IF 6.0) Pub Date: 2021-09-08, DOI: 10.1016/j.neunet.2021.08.028
Dingheng Wang 1, Guangshe Zhao 1, Hengnu Chen 2, Zhexian Liu 2, Lei Deng 2, Guoqi Li 2

Deep neural network (DNN) compression has become a hot topic in deep learning research, since modern DNNs have grown too large to deploy on practical resource-constrained platforms such as embedded devices. Among the various compression methods, tensor decomposition is a relatively simple and efficient strategy owing to its solid mathematical foundations and regular data structure. The two most common approaches are tensorizing neural weights into higher-order tensors for better decomposition, and directly mapping an efficient tensor structure onto a neural architecture with nonlinear activation functions. However, considerable accuracy loss remains a weakness of the tensorizing approach, especially for convolutional neural networks (CNNs), while studies of the mapping approach are comparatively limited and their compression ratios are modest. Therefore, in this work, by examining multiple types of tensor decomposition, we find that tensor train (TT), with its specific and efficient sequenced contractions, has the potential to combine the tensorizing and mapping approaches. We then propose a novel nonlinear tensor train (NTT) format, which embeds extra nonlinear activation functions in the sequenced contractions and convolutions, on top of the normal TT decomposition and the proposed convolution-connected TT format, to compensate for the accuracy loss that normal TT cannot avoid. Beyond shrinking the space complexity of the original weight matrices and convolutional kernels, we prove that NTT also affords efficient inference time. Extensive experiments and discussions demonstrate that DNNs compressed in our NTT format almost maintain accuracy on the MNIST, UCF11 and CIFAR-10 datasets, and that the accuracy loss caused by normal TT can be compensated significantly on large-scale datasets such as ImageNet.
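To make the idea concrete, below is a minimal NumPy sketch of a TT-format fully connected layer whose weight matrix is stored as a chain of TT cores and applied by sequenced contractions, with an optional nonlinearity inserted after each contraction to illustrate the nonlinear-TT notion described above. This is not the authors' implementation: the function name tt_linear, the core layout (r_{k-1}, m_k, n_k, r_k), the toy shapes and ranks, and the exact placement of the activation are all assumptions made for illustration.

# Sketch of a TT-format linear layer with optional nonlinearity
# between the sequenced core contractions (illustrative only).
import numpy as np

def tt_linear(x, cores, activation=None):
    """Apply a TT-matrix to x by contracting one core at a time.

    x          : (batch, m_1 * m_2 * ... * m_d) input
    cores      : list of d arrays, cores[k] has shape (r_k, m_k, n_k, r_{k+1}),
                 with r_0 = r_d = 1 (TT-matrix format)
    activation : optional elementwise function applied after each contraction
    """
    batch = x.shape[0]
    in_modes = [c.shape[1] for c in cores]
    out_modes = [c.shape[2] for c in cores]

    # running tensor: (B, rank, flattened remaining input modes);
    # B absorbs the batch plus the output modes produced so far
    h = x.reshape(batch, 1, int(np.prod(in_modes)))
    for k, core in enumerate(cores):
        m_k = in_modes[k]
        rest = int(np.prod(in_modes[k + 1:], dtype=int))
        h = h.reshape(h.shape[0], h.shape[1], m_k, rest)
        # contract the rank and current input mode with the core:
        # (B, r, m_k, rest) x (r, m_k, n_k, r') -> (B, rest, n_k, r')
        h = np.einsum('brmt,rmns->btns', h, core)
        if activation is not None:
            h = activation(h)  # nonlinearity embedded between contractions
        # fold the new output mode n_k into the leading (batch-like) axis
        h = np.transpose(h, (0, 2, 3, 1)).reshape(-1, core.shape[3], rest)
    # all input modes consumed; final rank and rest are both 1
    return h.reshape(batch, int(np.prod(out_modes)))

# toy usage: a 256 -> 81 layer factorized as (4,4,4,4) -> (3,3,3,3), TT-rank 2
rng = np.random.default_rng(0)
ranks = [1, 2, 2, 2, 1]
cores = [rng.standard_normal((ranks[k], 4, 3, ranks[k + 1])) * 0.1 for k in range(4)]
y = tt_linear(rng.standard_normal((8, 256)), cores, activation=np.tanh)
print(y.shape)  # (8, 81)

With activation=None the sketch reduces to an ordinary TT-matrix-by-vector product, so the only change the nonlinear variant introduces is the elementwise function applied between successive core contractions; the storage cost stays at the sum of the core sizes rather than the full m_1...m_d x n_1...n_d matrix.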




Updated: 2021-09-20