A Theoretical Insight Into the Effect of Loss Function for Deep Semantic-Preserving Learning
IEEE Transactions on Neural Networks and Learning Systems (IF 10.2), Pub Date: 2021-07-20, DOI: 10.1109/tnnls.2021.3090358
Ali Akbari, Muhammad Awais, Manijeh Bashar, Josef Kittler

Good generalization performance is the fundamental goal of any machine learning algorithm. Using the uniform stability concept, this article theoretically proves that the choice of loss function impacts the generalization performance of a trained deep neural network (DNN). The adopted stability-based framework provides an effective tool for comparing generalization error bounds across different loss functions. The main result of our analysis is that an effective loss function makes stochastic gradient descent more stable, which leads to a tighter generalization error bound and, in turn, better generalization performance. To validate our analysis, we study learning problems in which the classes are semantically correlated. To capture this semantic similarity between neighboring classes, we adopt the well-known semantics-preserving learning framework, namely label distribution learning (LDL). We propose two novel loss functions for the LDL framework and theoretically show that they provide stronger stability than other widely used loss functions adopted for training DNNs. The experimental results on three applications with semantically correlated classes, including facial age estimation, head pose estimation, and image esthetic assessment, validate the theoretical insights gained by our analysis and demonstrate the usefulness of the proposed loss functions in practical applications.
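To make the LDL setting concrete, the sketch below shows the conventional ingredients of label distribution learning as commonly described in the literature, not the two loss functions proposed in this paper: a soft label distribution built from a Gaussian kernel around the ground-truth class (so semantically neighboring classes, e.g. adjacent ages, receive nonzero probability mass) and a standard KL-divergence training objective. Function names, the kernel width sigma, and the age-estimation example are illustrative assumptions.

```python
import numpy as np

def soft_label_distribution(true_label, num_classes, sigma=2.0):
    """Build a label distribution centred on the ground-truth class.

    Neighbouring classes receive probability mass from a Gaussian kernel,
    encoding their semantic similarity to the target (e.g. adjacent ages
    in facial age estimation). Sigma is an assumed hyperparameter.
    """
    classes = np.arange(num_classes)
    weights = np.exp(-0.5 * ((classes - true_label) / sigma) ** 2)
    return weights / weights.sum()

def kl_ldl_loss(predicted_probs, target_distribution, eps=1e-12):
    """KL divergence between the target label distribution and the
    network's predicted class probabilities (the standard LDL objective,
    used here only as a baseline illustration)."""
    p = target_distribution
    q = np.clip(predicted_probs, eps, 1.0)
    return float(np.sum(p * np.log((p + eps) / q)))

# Example: age estimation over 0-100 years with ground-truth age 30.
target = soft_label_distribution(true_label=30, num_classes=101, sigma=2.0)
predicted = np.full(101, 1.0 / 101)  # an untrained, uniform prediction
print(kl_ldl_loss(predicted, target))
```

The paper's stability argument compares such objectives: a loss whose gradients change less under a single-sample perturbation of the training set yields a more uniformly stable SGD run and hence a tighter generalization bound.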

Updated: 2021-07-20