Abstract
Auto-encoders are unsupervised deep learning models that learn hidden representations from which the inputs can be reconstructed. While the learned representations suit applications built on unsupervised reconstruction, they may not be optimal for classification. In this paper, we propose a supervised auto-encoder (SupAE) with an additional classification layer on the representation layer that jointly predicts targets and reconstructs inputs, so that it learns discriminative features specifically for classification tasks. We stack several SupAEs and apply a greedy layer-by-layer training approach to learn the stacked supervised auto-encoder (SSupAE). We then propose an adaptive weighted majority voting algorithm to fuse the prediction results of the individual SupAEs and the SSupAE. Because each individual SupAE and the final SSupAE both provide the posterior probability that a sample belongs to each class, we use Shannon entropy, computed from these posterior probabilities, to measure the classification ability for different samples, and assign a high weight to samples with low entropy, so that more reasonable weights are assigned to different samples adaptively. Finally, we fuse the outputs of the classification layers with the proposed adaptive weighted majority voting algorithm to obtain the final recognition results. Experimental results on several classification datasets show that our model learns discriminative features and significantly improves classification performance.
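The entropy-based weighting described above can be illustrated with a small sketch. The abstract does not give the exact weight function, so the version below assumes a hypothetical weight of 1 - H / log(C), where H is the Shannon entropy of a classifier's posterior for a sample and C is the number of classes. The function names `shannon_entropy` and `entropy_weighted_vote` are illustrative and are not taken from the paper.

```python
import numpy as np


def shannon_entropy(probs, eps=1e-12):
    """Row-wise Shannon entropy of posteriors with shape (n_samples, n_classes)."""
    p = np.clip(probs, eps, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(p), axis=1)


def entropy_weighted_vote(posteriors):
    """Fuse classifier posteriors with per-sample, entropy-based vote weights.

    posteriors: array of shape (n_classifiers, n_samples, n_classes).
    Assumption (not from the paper): weight = 1 - H / log(n_classes), so a
    confident, low-entropy prediction gets a weight near 1 and a nearly
    uniform prediction gets a weight near 0.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    n_classifiers, n_samples, n_classes = posteriors.shape

    # One weight per (classifier, sample) pair.
    weights = np.stack(
        [1.0 - shannon_entropy(p) / np.log(n_classes) for p in posteriors]
    )

    # Each classifier votes for its most probable class, scaled by its weight.
    votes = np.zeros((n_samples, n_classes))
    rows = np.arange(n_samples)
    for k in range(n_classifiers):
        labels = posteriors[k].argmax(axis=1)
        votes[rows, labels] += weights[k]
    return votes.argmax(axis=1)


# One sample, two classes: one confident classifier against two unsure ones.
example = np.array([[[0.99, 0.01]], [[0.45, 0.55]], [[0.48, 0.52]]])
fused = entropy_weighted_vote(example)
```

In this example plain majority voting would pick class 1 (two votes to one), while the entropy-weighted vote follows the single confident classifier and picks class 0, which is the behaviour the abstract motivates.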
Acknowledgements
This work is supported by the National Natural Science Foundation of China under Grants 61806219, 61876189, 61503407, 61703426, and 61273275. This work is also supported by the Young Talent Fund of the University Association for Science and Technology in Shaanxi, China, No. 20190108.
About this article
Cite this article
Li, R., Wang, X., Quan, W. et al. Stacked Fusion Supervised Auto-encoder with an Additional Classification Layer. Neural Process Lett 51, 2649–2667 (2020). https://doi.org/10.1007/s11063-020-10223-w
DOI: https://doi.org/10.1007/s11063-020-10223-w