Abstract
Cross-Validation (CV) is still uncommon in time series modeling. Echo State Networks (ESNs), a prime example of Reservoir Computing (RC) models, are known for their fast and precise one-shot learning, which often benefits from good hyper-parameter tuning. This makes them ideal candidates to change the status quo. We discuss CV of time series for predicting a concrete time interval of interest, suggest several schemes for cross-validating ESNs, and introduce an efficient algorithm for implementing them. This algorithm is presented as two levels of optimization of k-fold CV. Training an RC model typically consists of two stages: (i) running the reservoir with the data and (ii) computing the optimal readouts. The first level of our optimization addresses the most computationally expensive part, (i), and makes its cost constant irrespective of k. It dramatically reduces reservoir computations in any type of RC system and is sufficient when k is small. The second level of optimization also makes the cost of part (ii) constant irrespective of large k, as long as the dimension of the output is low. We discuss when the proposed validation schemes for ESNs can be beneficial and three options for producing the final model, empirically investigate them on six different real-world datasets, and perform empirical run-time experiments. We provide the code in an online repository. The proposed CV schemes give better and more stable test performance on all six real-world datasets, spanning three task types. Empirical run times confirm our complexity analysis. In most situations, k-fold CV of ESNs and many other RC models can be done with virtually the same time and space complexity as a simple single-split validation. This enables CV to become a standard practice in RC.
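The two-level optimization described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' exact implementation (which is available in their online repository), but a simplified illustration of the two ideas: the reservoir is run over the data only once regardless of k (level 1), and per-fold ridge-regression readouts are obtained by subtracting each fold's contribution from globally accumulated Gram matrices rather than recomputing them from scratch (level 2). All function and variable names here are illustrative assumptions.

```python
import numpy as np

def run_reservoir(u, W_in, W, alpha=1.0):
    """Run a leaky-integrator ESN reservoir once over the whole input
    sequence and collect the states. This costly step is performed a
    single time, irrespective of the number of folds k (level 1)."""
    n_x, T = W.shape[0], u.shape[1]
    x = np.zeros(n_x)
    X = np.zeros((n_x, T))
    for t in range(T):
        x_tilde = np.tanh(W_in @ u[:, t] + W @ x)
        x = (1 - alpha) * x + alpha * x_tilde  # leaky integration
        X[:, t] = x
    return X

def kfold_ridge_readouts(X, Y, k=10, beta=1e-8):
    """k-fold CV of the linear readout W_out = Y X^T (X X^T + beta I)^-1.
    The global Gram matrices are accumulated once; each fold's training
    matrices are obtained by subtracting that fold's contribution, so the
    per-fold work does not grow with the data length (level 2 idea)."""
    n_x, T = X.shape
    folds = np.array_split(np.arange(T), k)
    A_all = X @ X.T  # (n_x, n_x), accumulated over all data
    B_all = Y @ X.T  # (n_y, n_x)
    errors = []
    for idx in folds:
        X_v, Y_v = X[:, idx], Y[:, idx]
        A = A_all - X_v @ X_v.T       # remove validation fold's part
        B = B_all - Y_v @ X_v.T
        W_out = np.linalg.solve(A + beta * np.eye(n_x), B.T).T
        errors.append(np.mean((W_out @ X_v - Y_v) ** 2))
    return np.array(errors)  # per-fold validation MSE
```

Note that this sketch recomputes a matrix solve per fold; the paper's second-level optimization goes further, keeping even that part effectively constant in k when the output dimension is low.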
Notes
Publicly available at https://data.bls.gov/timeseries/lns14000000
Publicly available at https://www.eia.gov/dnav/pet/hist/LeafHandler.ashx?n=PET&s=wgfupus2&f=W
Publicly available at http://www.sidc.be/silso/datafiles
Publicly available at https://www.physionet.org/physiobank/database/mitdb/
Publicly available at https://archive.ics.uci.edu/ml/datasets/Japanese+Vowels
Funding
This research was supported by the Research, Development and Innovation Fund of Kaunas University of Technology (grant No. PP-91K/19).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
Lukoševičius, M., Uselis, A. Efficient Implementations of Echo State Network Cross-Validation. Cogn Comput 15, 1470–1484 (2023). https://doi.org/10.1007/s12559-021-09849-2