A systematic review on overfitting control in shallow and deep neural networks
Artificial Intelligence Review ( IF 12.0 ) Pub Date : 2021-03-03 , DOI: 10.1007/s10462-021-09975-1
Mohammad Mahdi Bejani , Mehdi Ghatee

Shallow neural networks process features directly, while deep networks extract features automatically during training. Both kinds of model suffer from overfitting, or poor generalization, in many cases. Deep networks have more hyper-parameters than shallow ones, which increases the probability of overfitting. This paper presents a systematic review of overfitting control methods and categorizes them into passive, active, and semi-active subsets. A passive method designs a neural network before training, while an active method adapts the network during training. A semi-active method redesigns the network when training performance is poor. The review covers the theoretical and experimental backgrounds of these methods, their strengths and weaknesses, and emerging techniques for overfitting detection. It also addresses the adaptation of model complexity to data complexity, and describes the relation between overfitting control, regularization, network compression, and network simplification. The paper ends with some concluding lessons from the literature.
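
As an illustration of what overfitting detection during training can look like, the sketch below applies a patience-based early-stopping rule to a sequence of validation losses. This is a generic, widely used heuristic and is not taken from the paper; the function name `early_stopping_epoch`, its parameters `patience` and `min_delta`, and the loss values are all assumptions made for the example.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Scan per-epoch validation losses and report where training would stop.

    Hypothetical helper for illustration only. Returns (stop_epoch, best_epoch):
    stop_epoch is the epoch at which the validation loss has failed to improve
    for `patience` consecutive epochs (None if that never happens), and
    best_epoch is the epoch with the lowest validation loss seen so far.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # Validation loss improved: remember this epoch and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            # No improvement: a sustained run of these suggests overfitting.
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch

    return None, best_epoch


# Made-up validation losses that fall and then rise again.
losses = [1.0, 0.8, 0.7, 0.72, 0.75, 0.9, 1.1]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
```

In practice one would restore the model weights saved at `best_epoch`; the choice of `patience` trades off stopping too early on noisy validation curves against continuing to train after generalization has started to degrade.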




Updated: 2021-03-04