NLP-assisted software testing: A systematic mapping of the literature

https://doi.org/10.1016/j.infsof.2020.106321

Abstract

Context

To reduce the manual effort of extracting test cases from natural-language requirements, many approaches based on Natural Language Processing (NLP) have been proposed in the literature. Given the large number of approaches in this area, and since many practitioners are eager to utilize such techniques, it is important to synthesize and provide an overview of the state-of-the-art in this area.

Objective

Our objective is to summarize the state-of-the-art in NLP-assisted software testing, which could help practitioners adopt those NLP-based techniques. Moreover, this summary can benefit researchers by providing an overview of the research landscape.

Method

To address the above need, we conducted a survey in the form of a systematic literature mapping (classification). After compiling an initial pool of 95 papers, we applied a systematic voting process, and our final pool included 67 technical papers.

Results

This review paper provides an overview of the contribution types presented in the papers, types of NLP approaches used to assist software testing, types of required input requirements, and a review of tool support in this area. Some key results are: (1) only four of the 38 tools (11%) presented in the papers are available for download; (2) a large share of the papers (30 of 67) provided only a shallow exposure to the NLP aspects (almost no details).

Conclusion

This paper would benefit both practitioners and researchers by serving as an “index” to the body of knowledge in this area. The results could help practitioners utilize the existing NLP-based techniques; this in turn reduces the cost of test-case design and decreases the amount of human resources spent on test activities. After sharing this review with some of our industrial collaborators, initial insights show that this review can indeed be useful and beneficial to practitioners.

Introduction

Software testing is a fundamental activity to ensure a certain degree of quality in software systems. However, testing is an effort-intensive activity. In its conventional form, human testers (test engineers) conduct most (if not all) phases of software testing manually. One of those phases is test-case design, in which the human tester uses written (formal) requirements, often expressed in natural language (NL), to derive a set of test cases. Test-case design is also an effort-intensive activity [1], and practitioners are eager to get help from any (partially) automated approach to extract test suites from requirements [1]. Such a practice could save software companies considerable resources that are regularly spent on manually deriving and documenting test cases from requirements. Furthermore, as software requirements change, test cases have to be maintained, an activity which incurs further effort.

To reduce the manual effort of converting natural-language (NL) requirements into test cases, many approaches based on Natural Language Processing (NLP) have been proposed in the literature. Such an approach requires an input set of requirements written in NL. Then, following a series of NLP steps [2], a set of test cases is extracted automatically from the textual requirements. Let us clarify that we use the conventional definition of a “test case” [3] in this work: a test case is one or more inputs (as needed) and the expected output(s) (or behavior) for a unit or a system under test. For example, to test an absolute-value function, one would need at least two test cases: a test case with a positive integer, and another test case with a negative integer.
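The absolute-value example above can be written down as a minimal sketch. The names `SimpleTestCase`, `absolute_value`, `TEST_SUITE`, and `run_suite` are illustrative assumptions and do not come from the reviewed papers; the sketch only shows the "inputs plus expected output" definition in executable form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleTestCase:
    """A test case in the conventional sense: input(s) plus the expected output."""
    inputs: tuple
    expected: int


def absolute_value(x: int) -> int:
    """Unit under test (a hypothetical stand-in used only for illustration)."""
    return x if x >= 0 else -x


# At least two test cases: one positive input and one negative input.
TEST_SUITE = [
    SimpleTestCase(inputs=(5,), expected=5),
    SimpleTestCase(inputs=(-5,), expected=5),
]


def run_suite(unit, suite):
    """Return the test cases whose actual output differs from the expected output."""
    return [tc for tc in suite if unit(*tc.inputs) != tc.expected]


failures = run_suite(absolute_value, TEST_SUITE)
print(f"{len(TEST_SUITE) - len(failures)} of {len(TEST_SUITE)} test cases passed")
```

The second test case is what makes the suite useful: a faulty implementation that simply returns its argument passes the positive case and fails only on the negative one.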

In addition to the test-case design phase, NLP techniques have also been used in other software testing activities, for instance in the context of the test oracle problem (e.g., [4]).

To improve the efficiency of software testing, many NLP-based techniques and tools have been proposed in recent decades. We use the phrase “NLP-assisted software testing” in this paper to refer to all NLP-based techniques and tools which could assist any software testing activity, e.g., test-case design and test evaluation, as discussed above.

Given the growing body of knowledge in the area of NLP-assisted software testing, reviewing and getting an overview of the entire state-of-the-art and -practice in this area is challenging for a practitioner or a (new) researcher. As discussed above, practitioners are eager to get help from any (partially) automated approach to help them save time in extracting tests from requirements [1]. Knowing that they can adapt/customize an existing technique to predict and improve software testing in their own context can potentially help companies and test engineers bring more efficiency into their software testing practices. Thus, we have observed first-hand that there is a real need for review papers like the current one to provide a summary of the entire field and serve as an “index” to the body of knowledge in this area, so that a practitioner can get a snapshot of the current knowledge without having to find and read through all of the papers in this area. Furthermore, a recent insightful paper in IEEE Software [5] highlighted “the practical value of historical data [and approaches published in the past]” and a “vicious cycle of inflation of software engineering terms and knowledge” (due to many papers not adequately reviewing the state of the art). We believe survey papers like the current one help address this problem.

To systematically review and get an overview of studies in a given research area, Systematic Literature Review (SLR) and Systematic (Literature) Mapping (SLM or SM) studies are the established approaches. To address the above need and to find out what we, as a community, know about NLP-assisted software testing, we report in this paper an SLM in this area. Our review pool included 67 academic peer-reviewed papers. The first paper in this area was published in 2001, and this review study includes all the papers until the end of 2017. A few review (survey) papers have been previously published in this area, e.g., [6,7], but their review pools were somewhat limited, as the largest pool amounted to 16 papers (in [7]). As we discuss in Section 2.3, our survey is the most up-to-date and comprehensive review in the area, as it considers all 67 papers that we found published in this area between 2001 and 2017.

The remainder of this paper is structured as follows. Background and related work are presented in Section 2. We describe the research method and the planning phase of our review in Section 3. Section 4 presents the results of the literature review. Section 5 summarizes the findings and potential benefits of this review. Finally, in Section 6, we draw conclusions and suggest areas for further research. In the appendix, we show the list of the primary studies reviewed in this survey.

Section snippets

Background and related work

In this section, we first provide a brief overview of the concept of NLP, followed by an overview of NLP-assisted software testing. We then review the related work, which consists of the existing survey (review) papers on NLP-assisted software testing.

Research method (planning of the systematic review)

Based on our past experience in SLR and SLM studies, e.g., [21], and also using the established guidelines for conducting SLR and SLM studies in SE (e.g., [22], [23], [24], [25]), we developed our review process, as shown in Fig. 3. We discuss the planning and design phases of our review in the next sections.

Results

This section presents the results for the study's research questions (RQs). The section is structured according to the three groups of RQs:

  • Group 1: Common aspects in all review studies (classification of studies by contribution and research method types)

  • Group 2: Technical issues specific to the topic (NLP-assisted software testing)

  • Group 3: Issues specific to empirical and case studies

Discussions

We provide a summary of findings and implications of our results. We then assess benefits of this review study, and discuss potential threats to validity.

Conclusion and future work

By classifying the state-of-the-art and -practice, this survey paper mapped and reviewed the body of knowledge on NLP-assisted software testing. We systematically reviewed 67 papers in this area and classified them. By summarizing what we know, this paper provides an “index” to the vast body of knowledge in this area. Practitioners and researchers who are interested in reading each of the classified studies in depth can conveniently use the online Google spreadsheet at

Declaration of Competing Interest

The authors declare that they have NO known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (56)

  • P. Ammann et al., Introduction to Software Testing (2008)

  • A. Goffi et al., Automatic generation of oracles for exceptional behaviors

  • Z. Obrenovic, Insights from the Past: the IEEE Software History Experiment, IEEE Softw. (2017)

  • I. Ahsan et al., A comprehensive investigation of natural language processing techniques and tools to generate automated test cases

  • E.D. Liddy, Natural language processing

  • D.Y. Lee, Corpora and discourse analysis

  • P. Koehn, Statistical Machine Translation (2009)

  • L. Berti-Equille et al., Veracity of Data: from Truth Discovery Computation Algorithms to Models of Misinformation Dynamics, Synthesis Lect. Data Manage. (2015)

  • J.L. Leidner, Current issues in software engineering for natural language processing

  • C.D. Manning et al., The Stanford CoreNLP natural language processing toolkit

  • D. Graham et al., Experiences of Test Automation: Case Studies of Software Test Automation (2012)

  • V. Garousi et al., Test automation: not just for test execution, IEEE Softw. (2017)

  • M. Shahbaz et al., Automated discovery of valid test strings from the web using dynamic regular expressions collation and natural language processing

  • M. Zhang et al., A systematic approach to automatically derive test cases from use cases specified in restricted natural languages

  • C. Denger et al., Test case derived from requirement specification, Fraunhofer IESE Tech. Rep. (2003)

  • J.J. Gutiérrez Rodríguez et al., Generation of test cases from functional requirements. A survey

  • F. Nazir et al., The Applications of Natural Language Processing (NLP) for Software Requirement Engineering - A Systematic Literature Review

  • V. Garousi et al., Testing Embedded Software: A Survey of the Literature (2018)