Learning Student Networks via Feature Embedding.
IEEE Transactions on Neural Networks and Learning Systems (IF 10.2) Pub Date: 2021-01-04, DOI: 10.1109/tnnls.2020.2970494
Hanting Chen , Yunhe Wang , Chang Xu , Chao Xu , Dacheng Tao

Deep convolutional neural networks have been widely used in numerous applications, but their demanding storage and computational resource requirements prevent their deployment on mobile devices. Knowledge distillation aims to optimize a portable student network by transferring knowledge from a well-trained heavy teacher network. Traditional teacher-student methods rely on additional fully connected layers to bridge the intermediate layers of the teacher and student networks, which introduces a large number of auxiliary parameters. In contrast, this article aims to propagate information from teacher to student without introducing new variables that need to be optimized. We regard the teacher-student paradigm from a new perspective of feature embedding. By introducing the locality preserving loss, the student network is encouraged to generate low-dimensional features that inherit the intrinsic properties of the corresponding high-dimensional features from the teacher network. The resulting portable network can thus naturally maintain performance comparable to that of the teacher network. Theoretical analysis is provided to justify the lower computational complexity of the proposed method. Experiments on benchmark data sets and well-trained networks suggest that the proposed algorithm is superior to state-of-the-art teacher-student learning methods in terms of computational and storage complexity.
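The core idea described above — building a neighborhood graph from the teacher's high-dimensional features and penalizing the student when neighboring samples drift apart in its low-dimensional embedding — can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function name, the Gaussian k-nearest-neighbor weighting, and all parameters are assumptions for demonstration.

```python
import numpy as np

def locality_preserving_loss(teacher_feats, student_feats, k=2, sigma=1.0):
    """Illustrative locality preserving distillation loss.

    Neighborhood weights are computed from the teacher's features only;
    the loss penalizes the student when features of samples that are
    neighbors under the teacher are mapped far apart. The kNN-Gaussian
    weighting here is a common choice, assumed for this sketch.
    """
    n = teacher_feats.shape[0]
    # Pairwise squared distances between teacher features.
    d2 = np.sum((teacher_feats[:, None, :] - teacher_feats[None, :, :]) ** 2, axis=-1)
    # Gaussian affinity, zeroed on the diagonal.
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)
    # Keep only each sample's k strongest neighbors.
    for i in range(n):
        keep = np.argsort(-w[i])[:k]
        mask = np.zeros(n, dtype=bool)
        mask[keep] = True
        w[i, ~mask] = 0.0
    w = np.maximum(w, w.T)  # symmetrize the neighborhood graph
    # Student pairwise squared distances, weighted by the teacher graph.
    s2 = np.sum((student_feats[:, None, :] - student_feats[None, :, :]) ** 2, axis=-1)
    return float(np.sum(w * s2) / n)
```

Because the weights depend only on the fixed teacher, this loss adds no trainable parameters, which matches the article's stated goal of avoiding auxiliary fully connected layers.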

Updated: 2020-02-24