
Model learning: a survey of foundations, tools and applications

  • Review Article
  • Frontiers of Computer Science

Abstract

Software systems are all around us and play vital roles in our daily lives. The correct functioning of these systems is of prime concern. In addition to classical testing techniques, formal techniques such as model checking are used to reinforce the quality and reliability of software systems. However, obtaining a behavior model of an unknown software system, which is essential for model-based techniques, is a challenging task. To mitigate this problem, an emerging black-box analysis technique called model learning can be applied. It complements existing model-based testing and verification approaches by providing behavior models of black-box systems fully automatically. This paper surveys the model learning technique, which has recently attracted much attention from researchers, especially in the domains of testing and verification. First, we review the background and foundations of model learning, which form the basis of the subsequent sections. Second, we present some well-known model learning tools and summarize their merits and shortcomings in a comparison table. Third, we describe successful applications of model learning in multidisciplinary fields and current challenges along with possible future work, and we close with concluding remarks.


References

  1. Blair M, Obenski S, Bridickas P. Patriot missile defense: software problem led to system failure at dhahran. Report GAO/IMTEC-92-26, 1992

  2. Lions J L. Ariane 5 flight 501 failure. Report by the Inquiry Board, 1996

  3. Stephenson A G, Mulville D R, Bauer F H, Dukeman G A, Norvig P, LaPiana L S, Rutledge P J, Folta D, Sackheim R. Mars Climate Orbiter Mishap Investigation Board Phase I Report. NASA, Washington, DC, 1999

    Google Scholar 

  4. Leveson N G, Turner C S. An investigation of the therac-25 accidents. Journal of Computer, 1993, 26(7): 18–41

    Google Scholar 

  5. Coe T. Inside the pentium FDIV bug. DR Dobbs Journal, 1995, 20(4): 129

    Google Scholar 

  6. Ball T, Rajamani S K. The slam project: debugging system software via static analysis. In: Proceedings of the 29th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. 2002, 1–3

  7. Henzinger T A, Jhala R, Majumdar R, Sutre G. Lazy abstraction. Journal of ACM SIGPLAN Notices, 2002, 37(1): 58–70

    Article  MATH  Google Scholar 

  8. Walkinshaw N, Bogdanov K, Ali S, Holcombe M. Automated discovery of state transitions and their functions in source code. Journal of Software Testing, Verification and Reliability, 2008, 18(2): 99–121

    Article  Google Scholar 

  9. Walkinshaw N, Bogdanov K, Holcombe M, Salahuddin S. Reverse engineering state machines by interactive grammar inference. In: Proceedings of Working Conference on Reverse Engineering. 2007, 209–218

  10. Biermann A W, Krishnaswamy R. Constructing programs from example computations. Journal of IEEE Transactions on Software Engineering, 1976, 3: 141–153

    Article  MathSciNet  MATH  Google Scholar 

  11. Muller-Olm M, Schmidt D A, Steffen B. Model-checking: a tutorial introduction. In: Proceedings of International Static Analysis Symposium. 1999, 330–354

  12. Clarke E M, Grumberg O, Peled D. Model Checking. MIT Press, 1999

  13. Baier C, Katoen J P. Principles of Model Checking. MIT Press, 2008

  14. Broy M, Jonsson B, Katoen J P, Leucker M, Pretschner A. Model-based testing of reactive systems. Lecture Notes in Computer Science, 2005

  15. Utting M, Legeard B. Practical Model-based Testing: a Tools Approach. Elsevier, 2010

  16. Arbab F. Reo: a channel-based coordination model for component composition. Journal of Mathematical Structures in Computer Science, 2004, 14(3): 329–366

    Article  MathSciNet  MATH  Google Scholar 

  17. Ball T, Bounimova E, Cook B, Levin V, Lichtenberg J, McGarvey C, Ondrusek B, Rajamani S K, Ustuner A. Thorough static analysis of device drivers. Journal of ACM SIGOPS Operating Systems Review, 2006, 40(4): 73–85

    Article  Google Scholar 

  18. Hungar H, Niese O, Steffen B. Domain-specific optimization in automata learning. In: Proceedings of International Conference on Computer Aided Verification. 2003, 315–327

  19. Margaria T, Niese O, Raffelt H, Steffen B. Efficient test-based model generation for legacy reactive systems. In: Proceedings of the 9th IEEE International High-Level Design Validation and Test Workshop. 2004, 95–100

  20. Muccini H, Polini A, Ricci F, Bertolino A. Monitoring architectural properties in dynamic component-based systems. In: Proceedings of International Symposium on Component-Based Software Engineering. 2007, 124–139

  21. Shahbaz M. Reverse engineering enhanced state models of black box software components to support integration testing. Doctoral Dissertation, 2008

  22. Shu G, Lee D. Testing security properties of protocol implementations-a machine learning based approach. In: Proceedings of the 27th International Conference on Distributed Computing Systems. 2007

  23. Aarts F, Vaandrager F. Learning I/O automata. In: Proceedings of International Conference on Concurrency Theory. 2010, 71–85

  24. Moore E F. Gedanken-experiments on sequential machines. Journal of Automata Studies, 1956, 34: 129–153

    MathSciNet  Google Scholar 

  25. Berg T, Grinchtein O, Jonsson B, Leucker M, Raffelt H, Steffen B. On the correspondence between conformance testing and regular inference. In: Proceedings of International Conference on Fundamental Approaches to Software Engineering. 2005, 175–189

  26. Hagerer A, Hungar H, Niese O, Steffen B. Model generation by moderated regular extrapolation. In: Proceedings of International Conference on Fundamental Approaches to Software Engineering. 2002, 80–95

  27. Isberner M. Foundations of active automata learning: an algorithmic perspective. Doctoral Dissertation, 2015

  28. De Ruiter J, Poll E. Protocol state fuzzing of TLS implementations. In: Proceedings of the 24th USENIX Security Symposium. 2015, 193–206

  29. Vaandrager F. Model learning. Journal of Communications of ACM, 2017, 60(2): 86–95

    Article  Google Scholar 

  30. Peled D, Vardi M Y, Yannakakis M. Black box checking. Journal of Automata, Languages and Combinatorics, 2002, 7(2): 225–246

    MathSciNet  MATH  Google Scholar 

  31. Ammons G, Bodík R, Larus J R. Mining specifications. Journal of ACM Sigplan Notices, 2002, 37(1): 4–16

    Article  MATH  Google Scholar 

  32. Lo D, Khoo S C. Smartic: towards building an accurate, robust and scalable specification miner. In: Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering. 2006, 265–275

  33. Lorenzoli D, Mariani L, Pezzè M. Automatic generation of software behavioral models. In: Proceedings of the 30th International Conference on Software Engineering. 2008, 501–510

  34. Bennaceur A, Meinke K. Machine learning for software analysis: models, methods, and applications. In: Bennaceur A, Hähnle R, Meinke K, eds. Machine Learning for Dynamic Software Analysis: Potentials and Limits. Springer, Cham, 2018

    Chapter  Google Scholar 

  35. Starkie B, Zaanen V M, Estival D. The tenjinno machine translation competition. In: Proceedings of International Colloquium on Grammatical Inference. 2006, 214–226

  36. Starkie B, Coste F, Zaanen V M. The omphalos context-free grammar learning competition. In: Proceedings of International Colloquium on Grammatical Inference. 2004, 16–27

  37. Walkinshaw N, Lambeau B, Damas C, Bogdanov K, Dupont P. Stamina: a competition to encourage the development and assessment of software model inference techniques. Journal of Empirical Software Engineering, 2013, 18(4): 791–824

    Article  Google Scholar 

  38. Combe D, De La Higuera C, Janodet J C. Zulu: an interactive learning competition. In: Proceedings of International Workshop on Finite-State Methods and Natural Language Processing. 2009, 139–146

  39. Angluin D. Learning regular sets from queries and counterexamples. Journal of Information and Computation, 1987, 75(2): 87–106

    Article  MathSciNet  MATH  Google Scholar 

  40. Gold E M. Language identification in the limit. Information and Control, 1967, 10(5): 447–474

    Article  MathSciNet  MATH  Google Scholar 

  41. Valiant L G. A theory of the learnable. Journal of Communications of ACM, 1984, 27(11): 1134–1142

    Article  MATH  Google Scholar 

  42. Angluin D, Smith C H. Inductive inference: theory and methods. Journal of ACM Computing Surveys (CSUR), 1983, 15(3): 237–269

    Article  MathSciNet  Google Scholar 

  43. Gasarch W I, Smith C H. Learning via queries. Journal of ACM (JACM), 1992, 39(3): 649–674

    Article  MathSciNet  MATH  Google Scholar 

  44. Watanabe O. A framework for polynomial-time query learnability. Journal of Mathematical Systems Theory, 1994, 27(3): 211–229

    Article  MathSciNet  MATH  Google Scholar 

  45. Blumer A, Ehrenfeucht A, Haussler D, Warmuth M K. Learnability and the vapnik-chervonenkis dimension. Journal of ACM (JACM), 1989, 36(4): 929–965

    Article  MathSciNet  MATH  Google Scholar 

  46. Pitt L, Valiant L G. Computational limitations on learning from examples. Journal of ACM (JACM), 1988, 35(4): 965–984

    Article  MathSciNet  MATH  Google Scholar 

  47. Pitt L, Warmuth M K. Prediction-preserving reducibility. Journal of Computer and System Sciences, 1990, 41(3): 430–467

    Article  MathSciNet  MATH  Google Scholar 

  48. Angluin D, Kharitonov M. When won’t membership queries help? Journal of Computer and System Sciences, 1995, 50(2): 336–355

    Article  MathSciNet  MATH  Google Scholar 

  49. Steffen B, Howar F, Merten M. Introduction to active automata learning from a practical perspective. In: Proceedings of International School on Formal Methods for the Design of Computer, Communication and Software Systems. 2012, 256–296

  50. Rivest R L, Schapire R E. Inference of finite automata using homing sequences. Journal of Information and Computation, 1993, 103(2): 299–347

    Article  MathSciNet  MATH  Google Scholar 

  51. Groz R, Simao A, Petrenko A, Oriat C. Inferring finite state machines without reset using state identification sequences. In: Proceedings of International Conference on Testing Software and Systems. 2015, 161–177

  52. Groz S AR, Petrenko A, Oriat C. Inferring fsm models of systems without reset. In: Bennaceur A, Hähnle R, Meinke K, eds. Machine Learning for Dynamic Software. Springer, Cham, 2018

    Google Scholar 

  53. Hopcroft J E, Motwani R, Ullman J D. Introduction to automata theory, languages, and computation. ACM Sigact News, 2001, 32(1): 60–65

    Article  MATH  Google Scholar 

  54. Niese O. An integrated approach to testing complex systems. Doctoral Disseration, Technical University of Dortmund, Germany, 2003

  55. Shahbaz M, Groz R. Inferring mealy machines. In: Proceedings of International Symposium on Formal Methods. 2009, 207–222

  56. Shahbaz M, Li K, Groz R. Learning parameterized state machine model for integration testing. In: Proceedings of Computer Software and Applications Conference. 2007, 755–760

  57. Aarts F, Kuppens H, Tretmans J, Vaandrager F, Verwer S. Improving active mealy machine learning for protocol conformance testing. Journal of Machine Learning, 2014, 96(1–2): 189–224

    Article  MathSciNet  MATH  Google Scholar 

  58. Walkinshaw N, Derrick J, Guo Q. Iterative refinement of reverse-engineered models by model-based testing. In: Proceedings of International Symposium on Formal Methods. 2009, 305–320

  59. Groz R, Li K, Petrenko A, Shahbaz M. Modular system verification by inference, testing and reachability analysis. In: Suzuki K, Higashino T, Ulrich A, Hasegawa T, eds. Testing of Software and Communicating Systems. Springer, Berlin, Hedelberg, 2008

    Google Scholar 

  60. Huima A. Implementing conformiq qtronic. In: Petrenko A, Veanes M, Tretmans J, Griekamp W, eds. Testing of Software and Communicating Systems. Springer, Berlin, Hedelberg, 2007

    Google Scholar 

  61. Jhala R, Majumdar R. Software model checking. Journal of ACM Computing Surveys (CSUR), 2009, 41(4): 21

    Google Scholar 

  62. Howar F, Steffen B, Jonsson B, Cassel S. Inferring canonical register automata. In: Proceedings of International Workshop on Verification, Model Checking, and Abstract Interpretation. 2012, 251–266

  63. Howar F, Isberner M, Steffen B, Bauer O, Jonsson B. Inferring semantic interfaces of data structures. In: Proceedings of International Symposium on Leveraging Applications of Formal Methods, Verification and Validation. 2012, 554–571

  64. Merten M, Howar F, Steffen B, Cassel S, Jonsson B. Demonstrating learning of register automata. In: Proceedings of International Conference on Tools and Algorithms for the Construction and Analysis of Systems. 2012, 466–471

  65. Isberner M, Howar F, Steffen B. Learning register automata: from languages to program structures. Journal of Machine Learning, 2014, 96(1–2): 65–98

    Article  MathSciNet  MATH  Google Scholar 

  66. Cassel S, Howar F, Jonsson B, Steffen B. Active learning for extended finite state machines. Journal of Formal Aspects of Computing, 2016, 28(2): 233–263

    Article  MathSciNet  MATH  Google Scholar 

  67. Maler O, Mens I E. Learning regular languages over large alphabets. In: Proceedings of International Conference on Tools and Algorithms for the Construction and Analysis of Systems. 2014, 485–499

  68. Argyros G, D’Antoni L. The learnability of symbolic automata. In: Proceedings of International Conference on Computer Aided Verification. 2018, 427–445

  69. Drews S, D’Antoni L. Learning symbolic automata. In: Proceedings of International Conference on Tools and Algorithms for the Construction and Analysis of Systems. 2017, 173–189

  70. Argyros G, Stais I, Kiayias A, Keromytis A D. Back in black: towards formal, black box analysis of sanitizers and filters. In: Proceedings of IEEE Symposium on Security and Privacy. 2016, 91–109

  71. Veanes M, Hooimeijer P, Livshits B, Molnar D, Bjorner N. Symbolic finite state transducers: algorithms and applications. In: Proceedings of the 39th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. 2012, 137–150

  72. Veanes M. Applications of symbolic finite automata. In: Proceedings of International Conference on Implementation and Application of Automata. 2013, 16–23

  73. Argyros G, Stais I, Jana S, Keromytis A D, Kiayias A. Sfadiff: automated evasion attacks and fingerprinting using black-box differential automata learning. In: Proceedings of the ACM SIGSAC Conference on Computer and Communications Security. 2016, 1690–1701

  74. Alur R, Dill D L. A theory of timed automata. Journal of Theoretical Computer Science, 1994, 126(2): 183–235

    Article  MathSciNet  MATH  Google Scholar 

  75. Grinchtein O, Jonsson B, Leucker M. Learning of event-recording automata. Journal of Theoretical Computer Science, 2010, 411(47): 4029–4054

    Article  MathSciNet  MATH  Google Scholar 

  76. Clark A, Thollard F. Pac-learnability of probabilistic deterministic finite state automata. Journal of Machine Learning Research, 2004, 5(May): 473–497

    MathSciNet  MATH  Google Scholar 

  77. Castro J, Gavaldà R. Towards feasible pac-learning of probabilistic deterministic finite automata. In: Proceedings of International Colloquium on Grammatical Inference. 2008, 163–174

  78. Yokomori T. Learning Non-Deterministic Finite Automata From Queries and Counterexamples. Oxford University Press, 1994

  79. Denis F, Lemay A, Terlutte A. Learning regular languages using non deterministic finite automata. In: Proceedings of International Colloquium on Grammatical Inference. 2000, 39–50

  80. Alur R, Courcoubetis C, Henzinger T A, Ho P H. Hybrid automata: an algorithmic approach to the specification and verification of hybrid systems. In: Grossman R L, Herode A, Ravn A P, eds. Hybrid Systems. Springer, Berlin, Heideberg, 1993

    Google Scholar 

  81. Van Der Aalst W, Adriansyah A, De Medeiros A K A, Arcieri F, Baier T, Blickle T, Bose J C, Brand V. d P, Brandtjen R, Buijs J, et al. Process mining manifesto. In: Proceedings of International Conference on Business Process Management. 2011, 169–194

  82. de la Higuera C, Janodet J C. Inference of w-languages from prefixes. Theoretical Computer Science, 2004, 313(2): 295–312

    Article  MathSciNet  MATH  Google Scholar 

  83. Hopcroft J E, Motwani R, Ullman J D. Introduction to automata theory, languages, and computation. Journal of ACM Sigact News, 2001, 32(1): 60–65

    Article  MATH  Google Scholar 

  84. Bollig B, Habermehl P, Kern C, Leucker M. Angluin-style learning of NFA. In: Proceedings of International Joint Conference on Artificial Intelligence. 2009, 1004–1009

  85. Merten M, Howar F, Steffen B, Margaria T. Automata learning with on-the-fly direct hypothesis construction. In: Leveraging Applications of Formal Methods, Verification, and Validation. Springer, 2012

  86. Maler O, Pnueli A. On the learnability of infinitary regular sets. Journal of Information and Computation, 1995, 118(2): 316–326

    Article  MathSciNet  MATH  Google Scholar 

  87. Irfan M N, Oriat C, Groz R. Model inference and testing. In: Advances in Computers. Elsevier, 2013

  88. Kearns M J, Vazirani U V. An Introduction to Computational Learning Theory. MIT Press, 1994

  89. Isberner M, Howar F, Steffen B. The TTT algorithm: a redundancy-free approach to active automata learning. In: Proceedings of International Conference on Runtime Verification. 2014, 307–322

  90. Groce A, Peled D, Yannakakis M. Adaptive model checking. Logic Journal of the IGPL, 2006, 14(5): 729–744

    Article  MathSciNet  MATH  Google Scholar 

  91. Smeenk W, Moerman J, Vaandrager F, Jansen D N. Applying automata learning to embedded control software. In: Proceedings of International Conference on Formal Engineering Methods. 2015, 67–83

  92. Lee D, Yannakakis M. Principles and methods of testing finite state machines-a survey. Proceedings of IEEE, 1996, 84(8): 1090–1123

    Article  Google Scholar 

  93. Bos V P, Smetsers R, Vaandrager F. Enhancing automata learning by log-based metrics. In: Proceedings of International Conference on Integrated Formal Methods. 2016, 295–310

  94. Chen Y F, Hsieh C, Lengál O, Lii T J, Tsai M H, Wang B Y, Wang F. PAC learning-based verification and model synthesis. In: Proceedings of the 38th International Conference on Software Engineering. 2016, 714–724

  95. Chow T S. Testing software design modeled by finite-state machines. IEEE Transactions on Software Engineering, 1978, 3: 178–187

    Article  MATH  Google Scholar 

  96. Fujiwara S, Bochmann G V, Khendek F, Amalou M, Ghedamsi A. Test selection based on finite state models. IEEE Transactions on Software Engineering, 1991, 17(6): 591–603

    Article  Google Scholar 

  97. Shen Y, Lombardi F, Dahbura A T. Protocol conformance testing using multiple UIO sequences. IEEE Transactions on Communications, 1992, 40(8): 1282–1287

    Article  Google Scholar 

  98. Vuong S T. The uiov-method for protocol test sequence generation. In: Proceedings of the 2nd IFIP International Workshop on Protocol Test Systems. 1989, 161–175

  99. Aarts F, Jonsson B, Uijen J. Generating models of infinite-state communication protocols using regular inference with abstraction. In: Proceedings of IFIP International Conference on Testing Software and Systems. 2010, 188–204

  100. Aarts F, Jonsson B, Uijen J, Vaandrager F. Generating models of infinite-state communication protocols using regular inference with abstraction. Journal of Formal Methods in System Design, 2015, 46(1): 1–41

    Article  MATH  Google Scholar 

  101. Cho C Y, Babić D, Shin E C R, Song D. Inference and analysis of formal models of botnet command and control protocols. In: Proceedings of the 17th ACM Conference on Computer and Communications Security. 2010, 426–439

  102. Chalupar G, Peherstorfer S, Poll E, De Ruiter J. Automated reverse engineering using lego®. In: Proceedings of the 8th ENIX Workshop on Offensive Technologies ({WOOT} 14). 2014

  103. Vaandrager F. Active learning of extended finite state machines. In: Proceedings of International Conference on Testing Software and Systems. 2012, 5–7

  104. Aarts F, Heidarian F, Kuppens H, Olsen P, Vaandrager F. Automata learning through counterexample guided abstraction refinement. In: Proceedings of International Symposium on Formal Methods. 2012, 10–27

  105. Aarts F, Heidarian F, Vaandrager F. A theory of history dependent abstractions for learning interface automata. In: Proceedings of International Conference on Concurrency Theory. 2012, 240–255

  106. Gold E M. Complexity of automaton identification from given data. Journal of Information and Control, 1978, 37(3): 302–320

    Article  MathSciNet  MATH  Google Scholar 

  107. Isberner M, Steffen B, Howar F. Learnlib tutorial. In: Bartocci E, Majumdar R, eds. Runtime Verification. Springer, Cham, 2015, 358–377

    Chapter  Google Scholar 

  108. Xiao H. Automatic model learning and its applications in malware detection. Doctoral Dissertation, Nanyang Technological University, 2017

  109. Aarts F D. Tomte: bridging the gap between active learning and real-world systems. Doctoral Dissertation, Radboud University Nijmegen, 2014

  110. Fiterau-Brostean P. Active model learning for the analysis of network protocols. Doctoral Dissertation, Radboud University, 2018

  111. Irfan M N. Analysis and optimization of software model inference algorithms. Universita de Grenoble, Grenoble, France, 2012

    Google Scholar 

  112. Czerny M X. Learning-based software testing: evaluation of angluin’s L* algorithm and adaptations in practice. Batchelors Thesis, Karlsruhe Institute of Technology, 2014

  113. Henrix M. Performance improvement in automata learning. Master thesis, Radboud University, 2015

  114. Uijen J. Learning models of communication protocols using abstraction techniques. Master Thesis, Uppscola University, 2009

  115. Aarts F. Inference and abstraction of communication protocols. Master Thesis, Uppscola University, 2009

  116. Bohlin T, Jonsson B. Regular inference for communication protocol entities. Technical Report 2008–024, Uppsala University, Computer Systems, 2008

  117. Berg T. Regular inference for reactive systems. Doctoral Dissertation, Uppsala Universitet, 2006

  118. Berg T, Jonsson B, Leucker M, Saksena M. Insights to angluin’s learning. Journal of Electronic Notes in Theoretical Computer Science, 2005, 118: 3–18

    Article  MATH  Google Scholar 

  119. Irfan M N. State machine inference in testing context with long counterexamples. In: Proceedings of International Conference on Software Testing, Verification and Validation. 2010, 508–511

  120. Sidhu D P, Leung T K. Formal methods for protocol testing: a detailed study. IEEE Transactions on Software Engineering, 1989, 15(4): 413–426

    Article  Google Scholar 

  121. Dorofeeva R, El-Fakih K, Maag S, Cavalli A R, Yevtushenko N. Experimental evaluation of fsm-based testing methods. In: Proceedings of the 3rd IEEE International Conference on Software Engineering and Formal Methods. 2005, 23–32

  122. Aarts F, Howar F, Kuppens H, Vaandrager F. Algorithms for inferring register automata. In: Proceedings of International Symposium on Leveraging Applications of Formal Methods, Verification and Validation. 2014, 202–219

  123. Stevens P, Moller F. The edinburgh concurrency workbench user manual (version 7.1). Laboratory for Foundations of Computer Science, University of Edinburgh, 1999, 7

  124. Maler O, Pnueli A. On the learnability of infinitary regular sets. Information and Computation, 1991, 118: 316–326

    Article  MathSciNet  MATH  Google Scholar 

  125. Irfan M N, Oriat C, Groz R. Angluin style finite state machine inference with non-optimal counterexamples. In: Proceedings of the 1st International Workshop on Model Inference in Testing. 2010, 11–19

  126. Merten M, Howar F, Steffen B, Margaria T. Automata learning with on-the-fly direct hypothesis construction. In: Proceedings of International Symposium on Leveraging Applications of Formal Methods, Verification and Validation. 2011, 248–260

  127. Isberner M, Steffen B. An abstract framework for counterexample analysis in active automata learning. In: Proceedings of International Conference on Grammatical Inference. 2014, 79–93

  128. Li K, Groz R, Shahbaz M. Integration testing of distributed components based on learning parameterized I/O models. In: Proceedings of International Conference on Formal Techniques for Networked and Distributed Systems. 2006, 436–450

  129. Howar F. Active Learning of Interface Programs. Doctoral Dissertation, TU Dortmund University, 2012

  130. Cassel S, Howar F, Jonsson B, Steffen B. Learning extended finite state machines. In: Proceedings of International Conference on Software Engineering and Formal Methods. 2014, 250–264

  131. D’Antoni L, Veanes M. The power of symbolic automata and transducers. In: Proceedings of International Conference on Computer Aided Verification. 2017, 47–67

  132. Isberner M, Howar F, Steffen B. The open-source learnlib. In: Proceedings of International Conference on Computer Aided Verification. 2015, 487–495

  133. Isberner M, Howar F, Steffen B. The open-source learnlib. In: Proceedings of International Conference on Computer Aided Verification. 2015, 487–495

  134. Raffelt H, Steffen B. LearnLib a library for automata learning and experimentation. In: Proceedings of the 9th International Conference on Fundamental Approaches to Software Engineering. 2006, 377–380

  135. Isberner M, Howar F, Steffen B. Inferring automata with state-local alphabet abstractions. In: Proceedings of NASA Formal Methods Symposium. 2013, 124–138

  136. Hopcroft J E, Karp R M. A linear algorithm for testing equivalence of finite automata. Technical Report, Cornell University, 1971

  137. Giannakopoulou D, Rakamarić Z, Raman V. Symbolic learning of component interfaces. In: Proceedings of International Static Analysis Symposium. 2012, 248–264

  138. Howar F, Giannakopoulou D, Rakamarić Z. Hybrid learning: interface generation through static, dynamic, and symbolic analysis. In: Proceedings of the 2013 International Symposium on Software Testing and Analysis. 2013, 268–279

  139. Bainczyk A, Schieweck A, Isberner M, Margaria T, Neubauer J, Steffen B. Alex: mixed-mode learning of web applications at ease. In: Proceedings of International Symposium on Leveraging Applications of Formal Methods. 2016, 655–671

  140. Botinčan M, Babić D. Sigma* symbolic learning of input-output specifications. Journal of ACM SIGPLAN Notices, 2013, 48(1): 443–456

    Article  MATH  Google Scholar 

  141. Bollig B, Katoen J P, Kern C, Leucker M, Neider D, Piegdon D R. Libalf: the automata learning framework. In: Proceedings of International Conference on Computer Aided Verification. 2010, 360–364

  142. Merten M, Steffen B, Howar F, Margaria T. Next generation learnlib. In: Proceedings of International Conference on Tools and Algorithms for the Construction and Analysis of Systems. 2011, 220–223

  143. Cassel S, Howar F, Jonsson B. RAlib: A learnLib extension for inferring EFSMs. DIFTS, 2015

  144. Khalili A, Tacchella A. Aide: automata-identification engine. see Archive. codeplex Website, 2014

  145. Aarts F, Schmaltz J, Vaandrager F. Inference and abstraction of the biometric passport. In: Proceedings of International Symposium on Leveraging Applications of Formal Methods, Verification and Validation. 2010, 673–686

  146. Aarts F, De Ruiter J, Poll E. Formal models of bank cards for free. In: Proceedings of Software Testing, Verification and Validation Workshops. 2013, 461–468

  147. Smeenk W. Applying automata learning to complex industrial software. Master’s Thesis, Radboud University Nijmegen, 2012

  148. Janssen R, Vaandrager F W, Verwer S. Learning a state diagram of tcp using abstraction. Bachelor Thesis, ICIS, Radboud University Nijmegen, 2013, 12

  149. Fiterău-Broştean P, Janssen R, Vaandrager F. Learning fragments of the tcp network protocol. In: Proceedings of International Workshop on Formal Methods for Industrial Critical Systems. 2014, 78–93

  150. Tijssen M. Automatic modeling of ssh implementations with state machine learning algorithms. Bachelor thesis, Radboud University, Nijmegen, 2014

  151. Fiter T P, Janssen R, Vaandrager F. Combining model learning and model checking to analyze tcp implementations. In: Proceedings of International Conference on Computer Aided Verification. 2016, 454–471

  152. Fiterău-Broştean P, Lenaerts T, Poll E, Ruiter d J, Vaandrager F, Verleg P. Model learning and model checking of SSH implementations. In: Proceedings of the 24th ACM SIGSOFT International SPIN Symposium on Model Checking of Software. 2017, 142–151

  153. Schuts M, Hooman J, Vaandrager F. Refactoring of legacy software using model learning and equivalence checking: an industrial experience report. In: Proceedings of International Conference on Integrated Formal Methods. 2016, 311–325

  154. Neubauer J, Steffen B, Bauer O, Windmüller S, Merten M, Margaria T, Howar F. Automated continuous quality assurance. In: Proceedings of International Workshop on Formal Methods in Software Engineering: Rigorous and Agile Approaches. 2012, 37–43

  155. Windmüller S, Neubauer J, Steffen B, Howar F, Bauer O. Active continuous quality control. In: Proceedings of the 16th International ACM Sigsoft Symposium on Component-based Software Engineering. 2013, 111–120

  156. Verleg P, Poll E, Vaandrager F. Inferring SSH state machines using protocol state fuzzing. Masters Thesis, Radboud University, 2016

  157. Groce A, Peled D, Yannakakis M. Adaptive model checking. In: Proceedings of International Conference on Tools and Algorithms for the Construction and Analysis of Systems, 2002, 269–301

  158. Li K, Groz R, Shahbaz M. Integration testing of components guided by incremental state machine learning. In: Proceedings of the Testing: Academic and Industrial Conference on Practice and Research Techniques. 2006, 59–70

  159. Shahbaz M, Groz R. Analysis and testing of black-box component-based systems by inferring partial models. Journal of Software Testing, Verification and Reliability, 2014, 24(4): 253–288

    Article  Google Scholar 

  160. Shahbaz M, Li K, Groz R. Learning and integration of parameterized components through testing. In: Petrenko A, Veanes M, Tretmans J, Grieskamp W, eds. Testing of Software and Communicating Systems. Springer, Berlin, Hedelberg, 2007

    Google Scholar 

  161. Shahbaz M, Parreaux B, Klay F. Model inference approach for detecting feature interactions in integrated systems. In: Proceedings of International Conference on Feature Interactions in Software and Communication Systems. 2007, 161–171

  162. Walkinshaw N, Bogdanov K, Derrick J, Paris J. Increasing functional coverage by inductive testing: a case study. In: Proceedings of International Conference on Testing Software and Systems. 2010, 126–141

  163. Cobleigh J M, Giannakopoulou D, Pasareanu C S. Learning assumptions for compositional verification. In: Proceedings of International Conference on Tools and Algorithms for the Construction and Analysis of Systems. 2003,331–346

  164. Păsăreanu S C, Giannakopoulou D, Bobaru M G, Cobleigh J M, Barringer H. Learning to divide and conquer: applying the L* algorithm to automate assume-guarantee reasoning. Formal Methods in System Design, 2008, 32(3): 175–205

  165. He F, Mao S, Wang B Y. Learning-based assume-guarantee regression verification. In: Proceedings of International Conference on Computer Aided Verification. 2016, 310–328

  166. Chen Y F, Clarke E M, Farzan A, He F, Tsai M H, Tsay Y K, Wang B Y, Zhu L. Comparing learning algorithms in automated assume-guarantee reasoning. In: Proceedings of International Symposium on Leveraging Applications of Formal Methods, Verification and Validation. 2010, 643–657

  167. Chen Y F, Clarke E M, Farzan A, Tsai M H, Tsay Y K, Wang B Y. Automated assume-guarantee reasoning through implicit learning. In: Proceedings of International Conference on Computer Aided Verification. 2010, 511–526

  168. He F, Wang B Y, Yin L, Zhu L. Symbolic assume-guarantee reasoning through BDD learning. In: Proceedings of the 36th International Conference on Software Engineering. 2014, 1071–1082

  169. Lin S W, Hsiung P A. Compositional synthesis of concurrent systems through causal model checking and learning. In: Proceedings of International Symposium on Formal Methods. 2014, 416–431

  170. Neider D, Topcu U. An automaton learning approach to solving safety games over infinite graphs. In: Proceedings of International Conference on Tools and Algorithms for the Construction and Analysis of Systems. 2016, 204–221

  171. Margaria T, Niese O, Steffen B, Erochok A. System level testing of virtual switch (re-) configuration over IP. In: Proceedings of the 7th IEEE European Test Workshop. 2002, 67–72

  172. Niese O, Steffen B, Margaria T, Hagerer A, Brune G, Ide H D. Library-based design and consistency checking of system-level industrial test cases. In: Proceedings of International Conference on Fundamental Approaches to Software Engineering. 2001, 233–248

  173. Feng L, Lundmark S, Meinke K, Niu F, Sindhu M A, Wong P Y. Case studies in learning-based testing. In: Proceedings of International Conference on Testing Software and Systems. 2013, 164–179

  174. Oliveira A L, Silva J P. Efficient algorithms for the inference of minimum size DFAs. Journal of Machine Learning, 2001, 44(1–2): 93–119

  175. Choi W, Necula G, Sen K. Guided GUI testing of Android apps with minimal restart and approximate learning. ACM SIGPLAN Notices, 2013, 48(10): 623–640

  176. Alur R, Černý P, Madhusudan P, Nam W. Synthesis of interface specifications for Java classes. ACM SIGPLAN Notices, 2005, 40(1): 98–109

  177. Xiao H, Sun J, Liu Y, Lin S W, Sun C. Tzuyu: learning stateful type-states. In: Proceedings of the 28th International Conference on Automated Software Engineering. 2013, 432–442

  178. Raffelt H, Steffen B, Margaria T. Dynamic testing via automata learning. In: Proceedings of Haifa Verification Conference. 2007, 136–152

  179. Clarke E, Long D, McMillan K. Compositional model checking. In: Proceedings of the 4th Annual Symposium on Logic in Computer Science. 1989

  180. Smeenk W, Vaandrager F W. Applying automata learning to complex industrial software. Master’s Thesis, Radboud University Nijmegen, 2012

  181. Tappler M, Aichernig B K, Bloem R. Model-based testing IoT communication via active automata learning. In: Proceedings of IEEE International Conference on Software Testing, Verification and Validation. 2017, 276–287

  182. Aarts F, Fiterau-Brostean P, Kuppens H, Vaandrager F. Learning register automata with fresh value generation. In: Proceedings of International Colloquium on Theoretical Aspects of Computing. 2015, 165–183

  183. Meinke K, Sindhu M A. Incremental learning-based testing for reactive systems. In: Proceedings of International Conference on Tests and Proofs. 2011, 134–151

  184. Volpato M, Tretmans J. Active learning of nondeterministic systems from an ioco perspective. In: Proceedings of International Symposium on Leveraging Applications of Formal Methods, Verification and Validation. 2014, 220–235

  185. Groz R, Irfan M N, Oriat C. Algorithmic improvements on regular inference of software models and perspectives for security testing. In: Proceedings of International Symposium on Leveraging Applications of Formal Methods, Verification and Validation. 2012, 444–457

  186. Ernst M D, Perkins J H, Guo P J, McCamant S, Pacheco C, Tschantz M S, Xiao C. The Daikon system for dynamic detection of likely invariants. Journal of Science of Computer Programming, 2007, 69(1–3): 35–45

  187. Nachmanson L, Veanes M, Schulte W, Tillmann N, Grieskamp W. Optimal strategies for testing nondeterministic systems. ACM SIGSOFT Software Engineering Notes, 2004, 29(4): 55–64

  188. Volpato M, Tretmans J. Approximate active learning of nondeterministic input output transition systems. Electronic Communications of the EASST, 2015, 72

  189. Khalili A, Tacchella A. Learning nondeterministic Mealy machines. In: Proceedings of International Conference on Grammatical Inference. 2014, 109–123

  190. El-Fakih K, Groz R, Irfan M N, Shahbaz M. Learning finite state models of observable nondeterministic systems in a testing context. In: Proceedings of International Conference on Testing Software and Systems. 2010, 97–102

  191. Dallmeier V. Mining and checking object behavior. Doctoral Dissertation, 2010

  192. Van der Aalst W M P, Rubin V, Verbeek H M W, Van Dongen B F, Kindler E, Günther C W. Process mining: a two-step approach to balance between underfitting and overfitting. Journal of Software & Systems Modeling, 2010, 9(1): 87

  193. Meinke K. CGE: a sequential learning algorithm for Mealy automata. In: Proceedings of International Colloquium on Grammatical Inference. 2010, 148–162

  194. Peled D, Vardi M Y, Yannakakis M. Black box checking. In: Wu J, Chanson S T, Gao Q, eds. Formal Methods for Protocol Engineering and Distributed Systems. Springer, Boston, MA, 1999

  195. Cassel S, Howar F, Jonsson B, Merten M, Steffen B. A succinct canonical register automaton model. In: Proceedings of International Symposium on Automated Technology for Verification and Analysis. 2011, 366–380

Acknowledgements

We would like to thank Mr. Naeem Irfan, Mr. Kashif Saghar, and Mr. Markus Frohme (TU Dortmund) for valuable discussions on model learning. This work was supported by the National Natural Science Foundation of China (NSFC) (Grant Nos. 61872016, 61932007 and 61972013).

Corresponding author

Correspondence to Yongwang Zhao.

Additional information

Shahbaz Ali received the MSc degree in physics from Govt College University, Pakistan in 2000, and the MS degree in computer system engineering from Ghulam Ishaq Khan Institute, Pakistan in 2003. He is currently pursuing the PhD degree in computer software and theory at the School of Computer Science and Engineering, Beihang University, China. His research interests include program analysis, model learning, model-based testing (MBT), learning-based software testing, formal verification, and formal methods for software development.

Hailong Sun received the BS degree in computer science from Beijing Jiaotong University, China in 2001. He received the PhD degree in computer software and theory from Beihang University, China in 2008. He is a professor in the School of Software, Beihang University, China. His research interests include intelligent software engineering, crowd intelligence/crowdsourcing and distributed systems. He is a member of the ACM and the IEEE.

Yongwang Zhao received the PhD degree in computer science from Beihang University, China in 2009. He is an associate professor at the School of Computer Science and Engineering, Beihang University, China. He has also been a Research Fellow in the School of Computer Science and Engineering, Nanyang Technological University, Singapore since 2015. His research interests include formal methods, OS kernels, information-flow security, and AADL.

About this article

Cite this article

Ali, S., Sun, H. & Zhao, Y. Model learning: a survey of foundations, tools and applications. Front. Comput. Sci. 15, 155210 (2021). https://doi.org/10.1007/s11704-019-9212-z

  • DOI: https://doi.org/10.1007/s11704-019-9212-z