Abstract
Online learning is an important technique for sketching massive, high-speed, real-time data. Although this direction has attracted intensive attention, most of the literature in this area ignores the following three issues. (1) It pays little attention to the abstract hierarchical latent information present in examples, even though extracting these hierarchical latent representations helps to predict the class labels of examples. (2) A model whose structure is fixed in advance, before the data points are seen, is not suitable for streaming data whose probability distribution evolves. This challenge is referred to as “model flexibility”; with it in mind, an online deep learning model should have a variable underlying structure. (3) It is important to fuse these abstract hierarchical latent representations to achieve better classification performance, and different levels of latent representation should receive different weights when the distribution of the data stream changes. To address these issues, we propose a two-phase method, Online Deep Learning based on Auto-Encoder (ODLAE). Using an auto-encoder with a reconstruction loss, we extract abstract hierarchical latent representations of instances. Using a predictive loss, we devise two fusion strategies: an output-level fusion strategy, which fuses the classification results of each hidden layer of the encoder, and a feature-level fusion strategy, which uses a self-attention mechanism to fuse the outputs of all hidden layers. Finally, to improve the robustness of the algorithm, we also use a denoising auto-encoder to produce the hierarchical latent representations. Experimental results on different datasets show that the proposed algorithm (ODLAE) outperforms several baselines.
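The two fusion strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the layer weights are updated or how the attention scores are computed. The sketch assumes a Hedge-style multiplicative update for the output-level weights and a simplified attention pooling with a single query vector in place of full self-attention. All function names, the discount factor `beta`, and the `query` parameter are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def output_level_fusion(layer_probs, layer_weights):
    """Output-level fusion: weighted average of per-layer class probabilities.

    layer_probs[l] is the class-probability vector predicted from hidden layer l.
    """
    total = sum(layer_weights)
    n_classes = len(layer_probs[0])
    return [
        sum(w * p[c] for w, p in zip(layer_weights, layer_probs)) / total
        for c in range(n_classes)
    ]


def update_layer_weights(layer_weights, layer_losses, beta=0.9):
    """Assumed Hedge-style update: layers with higher loss lose weight."""
    scaled = [w * beta ** loss for w, loss in zip(layer_weights, layer_losses)]
    total = sum(scaled)
    return [w / total for w in scaled]


def feature_level_fusion(layer_features, query):
    """Feature-level fusion: attention-weighted sum of hidden-layer outputs.

    Assumes every hidden layer has the same width as the query vector.
    """
    scores = [sum(q * h for q, h in zip(query, feats)) for feats in layer_features]
    attn = softmax(scores)
    dim = len(layer_features[0])
    fused = [
        sum(a * feats[d] for a, feats in zip(attn, layer_features))
        for d in range(dim)
    ]
    return fused, attn


# Usage: two hidden layers, two classes.
probs = [[0.6, 0.4], [0.2, 0.8]]
weights = update_layer_weights([0.5, 0.5], layer_losses=[1.0, 0.0], beta=0.5)
prediction = output_level_fusion(probs, weights)
fused_features, attention = feature_level_fusion([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

Under these assumptions, a layer that incurs a larger loss contributes less to the fused prediction on the next example, which is one way to weight levels differently as the data distribution changes.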
Acknowledgements
This work was supported by the National Key R&D Program of China (No. 2016YFC0303703).
Cite this article
Zhang, Ss., Liu, Jw., Zuo, X. et al. Online deep learning based on auto-encoder. Appl Intell 51, 5420–5439 (2021). https://doi.org/10.1007/s10489-020-02058-8
DOI: https://doi.org/10.1007/s10489-020-02058-8