Abstract
In machine learning, offline training and online training are equally important because both arise in many real applications. The extreme learning machine (ELM) offers fast learning speed and high accuracy for offline training, and the online sequential ELM (OS-ELM) is a variant of ELM that supports online training. With the explosive growth of data volumes, running these algorithms on distributed computing platforms has become a necessity, yet no efficient distributed framework currently supports both ELM and OS-ELM. Apache Flink is an open-source stream-based distributed platform for both offline and online data processing, with good scalability, high throughput, and fault tolerance, so it can be used to accelerate both ELM and OS-ELM. In this paper, we first study the characteristics of ELM, OS-ELM, and distributed computing platforms, and then propose an efficient stream-based distributed framework for both ELM and OS-ELM, named ELM-SDF, implemented on Flink. We evaluate the algorithms in this framework with synthetic data on a distributed cluster. The advantages of the proposed framework are highlighted as follows. (1) FLELM, the ELM implementation in the framework, trains consistently faster than ELM on Hadoop and Spark, and scales better as well. (2) FLOS-ELM, the OS-ELM implementation, achieves better response time and throughput than OS-ELM on Hadoop and Spark when incremental training samples arrive. (3) The response time and throughput of FLOS-ELM improve further in native stream-processing mode when incremental data samples arrive continuously.
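To make the two training modes contrasted in the abstract concrete, the following is a minimal single-machine sketch of batch ELM and the OS-ELM recursive update, written with numpy. It is an illustration of the standard algorithms only, not the paper's distributed Flink implementation; all function names are hypothetical, and the OS-ELM initialization assumes the boot batch yields a full-column-rank hidden matrix.

```python
import numpy as np

def random_hidden_layer(n_features, n_hidden, rng):
    """ELM's input weights and biases are assigned randomly and never retrained."""
    W = rng.standard_normal((n_features, n_hidden))
    b = rng.standard_normal(n_hidden)
    return W, b

def hidden_output(X, W, b):
    """Hidden-layer output matrix H for input batch X."""
    return np.tanh(X @ W + b)

def elm_fit(X, T, W, b):
    """Batch (offline) ELM: output weights via the Moore-Penrose pseudoinverse."""
    H = hidden_output(X, W, b)
    return np.linalg.pinv(H) @ T

def oselm_init(X0, T0, W, b):
    """OS-ELM boot phase on an initial batch (needs at least n_hidden samples)."""
    H0 = hidden_output(X0, W, b)
    P = np.linalg.inv(H0.T @ H0)       # assumes H0 has full column rank
    beta = P @ H0.T @ T0
    return P, beta

def oselm_step(P, beta, Xk, Tk, W, b):
    """OS-ELM sequential phase: recursive least-squares update for one chunk,
    so old samples never need to be revisited."""
    H = hidden_output(Xk, W, b)
    S = np.linalg.inv(np.eye(len(Xk)) + H @ P @ H.T)
    P = P - P @ H.T @ S @ H @ P
    beta = beta + P @ H.T @ (Tk - H @ beta)
    return P, beta
```

With the same random hidden layer, feeding the data chunk by chunk through `oselm_step` recovers the same output weights as one batch `elm_fit` over all the data, which is the property that lets OS-ELM handle continuously arriving samples.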
Acknowledgements
This research is partially funded by the National Key Research and Development Program of China (Grant No. 2016YFC1401900), the National Natural Science Foundation of China (Grant Nos. 61872072, 61572119, 61572121, 61622202, 61732003, 61729201, 61702086, and U1401256), the Fundamental Research Funds for the Central Universities (Grant Nos. N171604007, and N171904007), the Natural Science Foundation of Liaoning Province (Grant No. 20170520164), and the China Postdoctoral Science Foundation (Grant No. 2018M631806).
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Informed consent
Informed consent was obtained from all individual participants.
Human and Animal Rights
This article does not contain any studies involving human participants and/or animals by any of the authors.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Ji, H., Wu, G. & Wang, G. Accelerating ELM training over data streams. Int. J. Mach. Learn. & Cyber. 12, 87–102 (2021). https://doi.org/10.1007/s13042-020-01158-8