Abstract
The gradient method is widely used for training feedforward neural networks, and most studies to date have focused on the squared error function. In this paper, a novel entropy error function is proposed for feedforward neural network training. Weak and strong convergence of the gradient method based on the entropy error function with batch input training patterns is rigorously proved. Numerical examples are given at the end of the paper to verify the effectiveness and correctness of the method. Compared with the squared error function, our method provides both faster learning and better generalization on the given test problems.
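The batch training scheme described above can be sketched as follows. This is a minimal illustration only: it assumes a one-hidden-layer sigmoid network and uses the standard binary cross-entropy as the entropy-type error, which may differ from the exact entropy error function proposed in the paper.

```python
import numpy as np

# Hedged sketch (not the paper's exact formulation): full-batch gradient
# descent on a one-hidden-layer feedforward network trained with a
# cross-entropy (entropy-type) error function instead of the squared error.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(p, y, eps=1e-12):
    # Entropy-type error for binary targets; eps guards against log(0).
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def train_batch(X, y, hidden=4, lr=0.5, epochs=2000, seed=0):
    """Full-batch gradient descent with cross-entropy loss; returns
    the trained weights and the loss recorded at every epoch."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    w2 = rng.normal(scale=0.5, size=(hidden, 1))
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1)          # hidden-layer activations
        p = sigmoid(H @ w2)          # network outputs in (0, 1)
        losses.append(cross_entropy(p, y))
        # For a sigmoid output with cross-entropy, the output-layer error
        # term simplifies to (p - y), which avoids the vanishing gradient
        # that the squared error suffers when the output saturates.
        delta_out = (p - y) / len(X)
        grad_w2 = H.T @ delta_out
        delta_hid = (delta_out @ w2.T) * H * (1.0 - H)
        grad_W1 = X.T @ delta_hid
        w2 -= lr * grad_w2
        W1 -= lr * grad_W1
    return W1, w2, losses

# Toy batch-training run on XOR, a non-linearly-separable problem:
# the whole pattern set is presented as one batch per gradient step.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])
W1, w2, losses = train_batch(X, y)
```

The simplified output-layer term `(p - y)` is the usual mechanism behind the faster learning attributed to entropy-type errors in the abstract: the squared error multiplies its output delta by the sigmoid derivative, which shrinks toward zero at saturated outputs, while the cross-entropy delta does not.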
Acknowledgements
The authors would like to thank the anonymous referees for their helpful comments and suggestions to improve the presentation of this paper.
Xiong, Y., Tong, X. Convergence of Batch Gradient Method Based on the Entropy Error Function for Feedforward Neural Networks. Neural Process Lett 52, 2687–2695 (2020). https://doi.org/10.1007/s11063-020-10374-w