A Novel CNN Training Framework: Loss Transferring
IEEE Transactions on Circuits and Systems for Video Technology (IF 8.3) Pub Date: 2020-04-21, DOI: 10.1109/tcsvt.2020.2989308
Cong Liang, Haixia Zhang, Dongfeng Yuan, Minggao Zhang

As one of the indispensable components of a convolutional neural network (CNN), the loss function guides the updating of model parameters during the training phase. Generally, different loss functions drive a CNN to learn different feature representations, and these representations can be treated as different pieces of knowledge learned from the objects. In this paper we introduce a novel training framework, namely Loss Transferring (LT), to improve the generalization ability of CNNs. LT consists of multiple training phases, each of which uses a different loss function. Under this framework, a CNN model can combine different knowledge of objects by transferring the knowledge learned via one loss function to the training under another. LT has two components: a loss function set and a training strategy. To build an appropriate loss function set, we establish two basic guidelines. Following these guidelines, we design a new loss function on the last layer of CNN models (the layer before the softmax operation), named the Near Classifier Hyper-Plane (N-CHP) loss, which makes the learned features of objects belonging to the same category have minimal intra-class distance and lie near the classifier hyperplane. Based on the two loss function sets {MSE, softmax} and {N-CHP, softmax}, we set up two specific training methods, LT_{MSE, softmax} and LT_{N-CHP, softmax}, which can be universally applied to different CNN models at low additional computational cost. Meanwhile, two training strategies, multi-phase strategy 1 and multi-phase strategy 2, are further proposed to improve the training efficiency of LT. Extensive experimental results on shallow, moderate, and deep models over four benchmark datasets, including MNIST, SVHN, CIFAR-10, and CIFAR-100, demonstrate that CNN models achieve clear performance improvements when trained with LT_{MSE, softmax} and LT_{N-CHP, softmax}, which verifies the effectiveness of LT and of the two proposed guidelines.
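To make the phase-wise idea concrete, below is a minimal PyTorch sketch of what an LT_{N-CHP, softmax}-style run could look like: phase 1 trains under an auxiliary loss on the last-layer features, phase 2 continues from the same weights under softmax cross-entropy, so the knowledge learned in phase 1 carries over. The `nchp_like_loss` function (an intra-class compactness term), the `(features, logits)` model interface, and all hyperparameters are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def nchp_like_loss(features, labels):
    """Illustrative stand-in for the N-CHP loss: mean squared distance of
    each last-layer feature to its class centroid (small intra-class
    distance). The paper's exact definition may differ."""
    loss = features.new_zeros(())
    classes = labels.unique()
    for c in classes:
        fc = features[labels == c]
        loss = loss + ((fc - fc.mean(dim=0)) ** 2).sum(dim=1).mean()
    return loss / classes.numel()

def train_lt(model, loader, epochs_per_phase=(10, 10), device="cpu"):
    """Two-phase LT schedule: phase 0 uses the auxiliary feature loss,
    phase 1 the usual softmax cross-entropy. Transfer happens simply by
    continuing training from the phase-0 weights."""
    opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    for phase, n_epochs in enumerate(epochs_per_phase):
        for _ in range(n_epochs):
            for x, y in loader:
                x, y = x.to(device), y.to(device)
                feats, logits = model(x)  # assumed interface: (features, logits)
                loss = nchp_like_loss(feats, y) if phase == 0 \
                       else F.cross_entropy(logits, y)
                opt.zero_grad()
                loss.backward()
                opt.step()
    return model
```

Under this reading, the extra cost over plain softmax training is only the auxiliary-loss phase, which matches the abstract's claim that LT adds little computational overhead.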
