Genetic programming-based regression for temporal data

Genetic Programming and Evolvable Machines

Abstract

Various machine learning techniques exist to perform regression on temporal data subject to concept drift. However, there are numerous nonstationary environments where these techniques may fail to either track or detect the changes. This study develops a genetic programming-based predictive model for temporal data with a numerical target that tracks changes in a dataset due to concept drift. When an environmental change is evident, the proposed algorithm reacts by clustering the data and then inducing nonlinear models that describe the generated clusters. These nonlinear models become the terminal nodes of genetic programming model trees. Experiments were carried out on seven nonstationary datasets, and the results suggest that the proposed model achieves high adaptation rates and accuracy across several types of concept drift. Future work will consider strengthening the adaptation to concept drift and implementing genetic programming on GPUs to provide fast learning for high-speed temporal data.
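The react-to-change step described in the abstract (cluster the data, then induce one nonlinear model per cluster) can be illustrated with a minimal sketch. This is not the authors' algorithm: the error-ratio change test, the use of k-means on a single input, the quadratic model form, and all function names below are assumptions chosen for illustration. No genetic programming search is performed; the nearest-centroid split simply stands in for a single hand-built model tree whose terminals are the per-cluster models.

```python
import random

def detect_change(errors, window=5, ratio=2.0):
    """Assumed rule: flag a change when the mean error of the latest window
    exceeds `ratio` times the mean error of the window before it."""
    if len(errors) < 2 * window:
        return False
    recent = sum(errors[-window:]) / window
    previous = sum(errors[-2 * window:-window]) / window
    return recent > ratio * max(previous, 1e-12)

def nearest(x, centroids):
    """Index of the centroid closest to x."""
    return min(range(len(centroids)), key=lambda j: abs(x - centroids[j]))

def kmeans_1d(xs, k, iters=20, seed=0):
    """Plain k-means on scalar inputs; returns the sorted centroids."""
    centroids = random.Random(seed).sample(xs, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            groups[nearest(x, centroids)].append(x)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(g) / len(g) if g else c
                     for g, c in zip(groups, centroids)]
    return sorted(centroids)

def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_quadratic(points):
    """Least-squares fit of y = c0 + c1*x + c2*x^2 via the normal equations."""
    s = [sum(x ** p for x, _ in points) for p in range(5)]
    t = [sum(y * x ** p for x, y in points) for p in range(3)]
    return solve([[s[i + j] for j in range(3)] for i in range(3)], t)

def build_model(points, k=2):
    """Cluster the inputs, fit one quadratic per cluster, and return a
    predictor that routes each input to its cluster's model."""
    centroids = kmeans_1d([x for x, _ in points], k)
    models = [fit_quadratic([p for p in points if nearest(p[0], centroids) == j])
              for j in range(k)]
    def predict(x):
        c0, c1, c2 = models[nearest(x, centroids)]
        return c0 + c1 * x + c2 * x * x
    return predict

# Synthetic data with two regimes (hypothetical, for illustration only).
data = [(i / 10, (i / 10) ** 2) for i in range(10)]
data += [(2 + i / 10, 5 - (2 + i / 10)) for i in range(10)]

if detect_change([0.1] * 5 + [1.0] * 5):
    predict = build_model(data, k=2)
    print(predict(0.5), predict(2.5))
```

In this sketch the model is rebuilt only when the change test fires, which mirrors the reactive behaviour the abstract describes. Each cluster needs at least three distinct inputs for the quadratic fit to be solvable.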



Acknowledgements

The authors would like to thank the reviewers for their valuable comments, which greatly improved the structure of this work.

Author information

Corresponding author

Correspondence to Cry Kuranga.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Area Editor: Una-May O'Reilly.

Appendix

See Table 7.

Table 7 The p-values for the Test Environment

About this article

Cite this article

Kuranga, C., Pillay, N. Genetic programming-based regression for temporal data. Genet Program Evolvable Mach 22, 297–324 (2021). https://doi.org/10.1007/s10710-021-09404-w

  • DOI: https://doi.org/10.1007/s10710-021-09404-w
