Regression test case prioritization by code combinations coverage

doi:10.1016/j.jss.2020.110712

Journal of Systems and Software

Volume 169, November 2020, 110712

Highlights

• We propose a new coverage criterion, code combinations coverage.
• We propose a new approach, code combinations coverage based prioritization (CCCP).
• We report on empirical studies to investigate the performance of CCCP.
• We provide some practical guidelines for testers when using CCCP.

Abstract

Regression test case prioritization (RTCP) aims to improve the rate of fault detection by executing more important test cases as early as possible. Various RTCP techniques have been proposed based on different coverage criteria. Among them, a majority of techniques leverage code coverage information to guide the prioritization process, with code units being considered individually, and in isolation. In this paper, we propose a new coverage criterion, code combinations coverage, that combines the concepts of code coverage and combination coverage. We apply this coverage criterion to RTCP, as a new prioritization technique, code combinations coverage based prioritization (CCCP). We report on empirical studies conducted to compare the testing effectiveness and efficiency of CCCP with four popular RTCP techniques: total, additional, adaptive random, and search-based test prioritization. The experimental results show that even when the lowest combination strength is assigned, overall, the CCCP fault detection rates are greater than those of the other four prioritization techniques. The CCCP prioritization costs are also found to be comparable to the additional test prioritization technique. Moreover, our results also show that when the combination strength is increased, CCCP provides higher fault detection rates than the state-of-the-art, regardless of the levels of code coverage.

Introduction

Modern software systems evolve continuously as detected bugs are fixed, new functionality is added, and the system architecture is refactored. Regression testing is conducted to ensure that the changed source code does not introduce new defects. However, running an entire regression test suite can become expensive because the suite naturally grows during software maintenance and evolution. In an industrial case reported by Rothermel et al. (1999), for example, executing the entire test suite could take several weeks.

Regression test case prioritization (RTCP) has become one of the most effective approaches to reduce the overheads in regression testing (Li et al., 2007, Jiang et al., 2009, Mei et al., 2012, Saha et al., 2015, Ledru et al., 2009). RTCP techniques reorder the execution sequence of regression test cases, aiming to execute those test cases more likely to detect faults (according to some award function) as early as possible (Hao et al., 2016, Hao et al., 2014, Zhang et al., 2009).

Traditional RTCP techniques (Rothermel et al., 1999, Zhang et al., 2013a, Wang et al., 2017) usually use code coverage criteria to guide the prioritization process. Intuitively speaking, a code coverage criterion indicates the percentage of some code units (e.g., statements) covered by a test case. The expectation is that test cases with higher code coverage values have a greater chance of detecting faults (Zhu et al., 1997). Because of this, the goal of maximizing code coverage has been incorporated into various RTCP techniques, including the two greedy strategies, total and additional (Rothermel et al., 1999). Given a coverage criterion (e.g., method, branch, or statement coverage), the total strategy selects the next test case with the greatest absolute coverage, whereas the additional strategy selects the one with the greatest coverage of code units not already covered by the prioritized test cases. Furthermore, Li et al. (2007) proposed two search-based RTCP techniques (a hill-climbing strategy and a genetic strategy) that explore the search space (the set of all permutations of the test cases) to find a sequence with a better fault detection rate. Jiang et al. (2009) investigated adaptive random techniques (Huang et al., 2019) to prioritize test cases using code coverage criteria. In an attempt to bridge the gap between the two greedy strategies, Zhang et al. (2013a) proposed a unified approach based on the fault detection probability for each test case (referred to as a p value).
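As an illustration of the two greedy strategies described above, the following is a minimal sketch rather than the implementation used in the cited work. The function names, the dictionary-based coverage representation, the alphabetical tie-breaking, and the reset step (clearing the covered set once no remaining test case adds new coverage, a commonly used variant) are all assumptions made for this example.

```python
from typing import Dict, List, Set


def total_prioritization(coverage: Dict[str, Set[int]]) -> List[str]:
    """Total strategy: order tests by absolute number of covered units."""
    # Ties are broken by test name (an assumption, for determinism only).
    return sorted(coverage, key=lambda t: (-len(coverage[t]), t))


def additional_prioritization(coverage: Dict[str, Set[int]]) -> List[str]:
    """Additional strategy: repeatedly pick the test that covers the most
    units not yet covered by the tests already selected."""
    remaining = set(coverage)
    covered: Set[int] = set()
    order: List[str] = []
    while remaining:
        best = min(remaining, key=lambda t: (-len(coverage[t] - covered), t))
        if covered and not (coverage[best] - covered):
            # Assumed variant: no candidate adds new coverage, so reset the
            # covered set and choose again among the remaining tests.
            covered = set()
            best = min(remaining, key=lambda t: (-len(coverage[t]), t))
        order.append(best)
        remaining.remove(best)
        covered |= coverage[best]
    return order


if __name__ == "__main__":
    # Hypothetical statement coverage: test name -> covered statement ids.
    example = {"t1": {1, 2, 3, 4}, "t2": {1, 2, 3}, "t3": {5, 6}, "t4": {4}}
    print(total_prioritization(example))
    print(additional_prioritization(example))
```

On this hypothetical data, the total strategy schedules `t2` second because of its large absolute coverage, while the additional strategy prefers `t3`, which covers statements that `t1` missed.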

In this paper, we propose a new coverage criterion, code combinations coverage, that combines the concepts of code coverage (Zhu et al., 1997) and combination coverage (Nie and Leung, 2011). Given a set of regression test cases $T$, each test case is first transformed into an equally-sized tuple. Each position in this tuple is a binary value representing whether the corresponding item (such as a branch, statement, or method) is covered by this test case. In other words, $T$ is represented by a set $T'$ of abstract test cases with binary values. The code combinations coverage of $T$ is measured by the traditional combination coverage of $T'$. We apply this new coverage criterion to RTCP, proposing a new prioritization technique: code combinations coverage based prioritization (CCCP).
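The construction above can be sketched in a few lines. This is an illustrative reading, not the authors' implementation: it assumes the standard combinatorial-testing notion of strength-λ combination coverage (every choice of λ positions together with every binary value assignment counts as one combination), and the function names, data layout, and the greedy ordering in `greedy_order` are hypothetical.

```python
from itertools import combinations
from math import comb
from typing import Dict, List, Sequence, Set, Tuple

# A combination is (positions, values at those positions).
Combo = Tuple[Tuple[int, ...], Tuple[int, ...]]


def to_abstract(covered: Set[int], num_units: int) -> Tuple[int, ...]:
    """Turn a set of covered unit indices into a binary tuple."""
    return tuple(1 if u in covered else 0 for u in range(num_units))


def combos(abstract: Sequence[int], strength: int) -> Set[Combo]:
    """All strength-wise (positions, values) combinations of one tuple."""
    return {(pos, tuple(abstract[p] for p in pos))
            for pos in combinations(range(len(abstract)), strength)}


def code_combinations_coverage(suite: List[Tuple[int, ...]],
                               strength: int) -> float:
    """Fraction of all possible strength-wise binary combinations that
    the abstract test suite covers (suite must be non-empty)."""
    n = len(suite[0])
    covered: Set[Combo] = set()
    for t in suite:
        covered |= combos(t, strength)
    return len(covered) / (comb(n, strength) * 2 ** strength)


def greedy_order(tests: Dict[str, Tuple[int, ...]],
                 strength: int) -> List[str]:
    """Assumed greedy scheme: pick the test that covers the most
    not-yet-covered combinations; ties are broken by name."""
    remaining = dict(tests)
    covered: Set[Combo] = set()
    order: List[str] = []
    while remaining:
        best = min(remaining, key=lambda name: (
            -len(combos(remaining[name], strength) - covered), name))
        covered |= combos(remaining.pop(best), strength)
        order.append(best)
    return order


if __name__ == "__main__":
    # Hypothetical program with three code units and three test cases.
    tests = {"t1": to_abstract({0, 1}, 3),
             "t2": to_abstract({2}, 3),
             "t3": to_abstract({0}, 3)}
    print(code_combinations_coverage(list(tests.values()), 2))
    print(greedy_order(tests, 1))
```

Under this reading, a position with value 0 (unit not covered) contributes to combinations just as a position with value 1 does, which is what distinguishes the criterion from counting covered units alone.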

We conducted empirical studies on 14 versions of four Java programs, and 30 versions of five real-world Unix utility programs. Our goal was to investigate the testing effectiveness and efficiency of CCCP compared with four widely-used RTCP techniques: total, additional, adaptive random, and search-based test prioritization. The results show that when the lowest combination strength is assigned, overall, our approach has better fault detection rates than the other four test prioritization techniques. It not only achieves testing efficiency comparable to that of the additional technique, but also requires much less prioritization time than the adaptive random and search-based techniques. In addition, while the code coverage granularity does not affect the testing effectiveness of CCCP, the test case granularity affects it significantly. Furthermore, when the combination strength is increased, CCCP provides better fault detection rates than all other RTCP techniques, regardless of the level of code coverage.

The main contributions of this paper are:

• We propose a new coverage criterion called code combinations coverage that combines the concepts of code coverage and combination coverage.
• We apply code combinations coverage to RTCP, leading to a new prioritization technique called code combinations coverage based prioritization (CCCP).
• We report on empirical studies conducted to investigate the test effectiveness and efficiency of CCCP compared to four widely-used prioritization techniques, and also analyze the impact of code coverage granularity and test case granularity on the effectiveness of CCCP.
• We provide some guidelines for how to choose the combination strength and code-coverage level for CCCP, under different testing scenarios.

The rest of this paper is organized as follows: Section 2 presents some background information. Section 3 introduces the proposed approach. Section 4 presents the research questions, and explains details of the empirical study. Section 5 provides the detailed results of the study and answers the research questions. Section 6 discusses some related work, and Section 7 concludes this paper, including highlighting some potential future work.

Section snippets

Background

In this section, we provide some background information about abstract test cases and test case prioritization.

Approach

In this section, we introduce the details of test case prioritization by code combinations coverage.

Empirical study

In this section, we present our empirical study, including the research questions underlying the study. We also discuss some independent and dependent variables, and explain the subject programs, test suites, and experimental setup in detail.

Results and analysis

This section presents the experimental results to answer the research questions.

To answer RQ1 to RQ4, Figs. 2 to 8 present box plots of the distribution of the APFD or APFD$_{c}$ values (averaged over 1000 iterations). Each box plot shows the mean (square in the box), median (line in the box), and lower and upper quartiles (25th and 75th percentiles) of the APFD or APFD$_{c}$ values for the RTCP techniques. Statistical analyses are also provided in Table 3, Table

Related work

A considerable amount of research has been conducted into regression testing techniques with a goal of improving the testing performance. This includes test case prioritization (Rothermel et al., 1999, Miranda et al., 2018), reduction (Chen et al., 2017, Shi et al., 2015) and selection (Zhang, 2018, Gligoric et al., 2015a). This Related Work section focuses on test case prioritization, which aims to detect faults as early as possible through the reordering of regression test cases (Yoo and

Conclusions and future work

In this paper, we have introduced a new coverage criterion that combines the concepts of code and combination coverage. Based on this, we proposed a new prioritization technique, code combinations coverage based prioritization (CCCP). Results from our empirical studies have demonstrated that CCCP with the lowest combination strength ($\lambda = 1$) can achieve better fault detection rates than four well-known, popular prioritization techniques (total, additional, adaptive random, and search-based test

CRediT authorship contribution statement

Rubing Huang: Conceptualization, Methodology, Writing - review & editing, Investigation. Quanjun Zhang: Software, Data curation, Writing - original draft. Dave Towey: Writing - review & editing, Validation. Weifeng Sun: Visualization, Formal analysis. Jinfu Chen: Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

We would like to thank the anonymous reviewers for their many constructive comments. We would also like to thank Christopher Henard for providing us the fault data for the five C subject programs. This work is supported by the National Natural Science Foundation of China under grant nos. 61502205, 61872167, and U1836116, the project funded by China Postdoctoral Science Foundation under grant no. 2019T120396, and the Senior Personnel Scientific Research Foundation of Jiangsu University under

References (60)

Kaminski, G. et al., 2013. Improving logic-based testing. J. Syst. Softw.
Khatibsyarbini, M. et al., 2018. Test case prioritization approaches in regression testing: A systematic literature review. Inf. Softw. Technol.
Andrews, J.H., Briand, L.C., Labiche, Y., 2005. Is mutation an appropriate tool for testing experiments? In:...
Andrews, J.H. et al., 2006. Using mutation analysis for assessing and comparing testing coverage criteria. IEEE Trans. Softw. Eng.
Arcuri, A. et al., 2014. A hitchhiker’s guide to statistical tests for assessing randomized algorithms in software engineering. Softw. Test. Verif. Reliab.
Chen, J., Bai, Y., Hao, D., Zhang, L., Zhang, L., Xie, B., 2017. How do assertions impact coverage-based test-suite...
Chi, J., Qu, Y., Zheng, Q., Yang, Z., Jin, W., Cui, D., Liu, T., 2018. Test case prioritization based on method call...
Do, H. et al., 2005. Supporting controlled experimentation with testing techniques: An infrastructure and its potential impact. Empir. Softw. Eng.
Do, H. et al., 2010. The effects of time constraints on test case prioritization: A series of controlled experiments. IEEE Trans. Softw. Eng.
Do, H., Rothermel, G., 2005. A controlled experiment assessing test case prioritization techniques via mutation faults....
Do, H. et al., 2006. Prioritizing JUnit test cases: An empirical assessment and cost-benefits analysis. Empir. Softw. Eng.
Eghbali, S. et al., 2016. Test case prioritization using lexicographical ordering. IEEE Trans. Softw. Eng.
Elbaum, S., Malishevsky, A.G., Rothermel, G., 2000. Prioritizing test cases for regression testing. In: Proceedings of...
Elbaum, S., Malishevsky, A., Rothermel, G., 2001. Incorporating varying test costs and fault severities into test case...
Epitropakis, M.G., Yoo, S., Harman, M., Burke, E.K., 2015. Empirical evaluation of pareto efficient multi-objective...
Free Software Foundation, 2017. GNU FTP Server,...
Free Software Foundation, 2019. gcov: A Test coverage program,...
GCC team, 2019. gcc: The GNU Compiler Collection,...
Gligoric, M., Eloussi, L., Marinov, D., 2015a. Practical regression test selection with dynamic file dependencies. In:...
Gligoric, M. et al., 2015. Guidelines for coverage-based comparisons of non-adequate test suites. ACM Trans. Softw. Eng. Methodol.
Grindal, M. et al., 2006. An evaluation of combination strategies for test case selection. Empir. Softw. Eng.
Hao, D. et al., 2016. To be optimal or not in test-case prioritization. IEEE Trans. Softw. Eng.
Hao, D. et al., 2014. A unified test case prioritization approach. ACM Trans. Softw. Eng. Methodol.
Henard, C., Papadakis, M., Harman, M., Jia, Y., Le Traon, Y., 2016. Comparing white-box and black-box test...
Huang, R. et al., 2019. A survey on adaptive random testing. IEEE Trans. Softw. Eng.
Jiang, B., Zhang, Z., Chan, W.K., Tse, T., 2009. Adaptive random test case prioritization. In: Proceedings of the 24th...
Jones, J.A. et al., 2003. Test-suite reduction and prioritization for modified condition/decision coverage. IEEE Trans. Softw. Eng.
Just, R., Jalali, D., Inozemtseva, L., Ernst, M.D., Holmes, R., Fraser, G., 2014. Are mutants a valid substitute for...
Just, R., Kapfhammer, G.M., Schweiggert, F., 2012. Do redundant mutants affect the effectiveness and efficiency of...
Ledru, Y., Petrenko, A., Boroday, S., 2009. Using string distances for test case prioritisation. In: Proceedings of the...

Rubing Huang received the Ph.D. degree in computer science and technology from the Huazhong University of Science and Technology, Wuhan, China, in 2013. From 2016 to 2018, he was a visiting scholar at Swinburne University of Technology and at Monash University, Australia. He is an associate professor in the Department of Software Engineering, School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, China. His current research interests include software testing (including adaptive random testing, random testing, combinatorial testing, and regression testing), debugging, and maintenance. He has more than 50 publications in journals and proceedings, including in IEEE Transactions on Software Engineering, IEEE Transactions on Reliability, Journal of Systems and Software, Information and Software Technology, IET Software, The Computer Journal, International Journal of Software Engineering and Knowledge Engineering, ICSE, ICST, COMPSAC, QRS, SEKE, and SAC. He is a senior member of the IEEE and the China Computer Federation, and a member of the ACM. More about him and his work is available online at https://huangrubing.github.io/.

Quanjun Zhang received the B.Eng. degree in computer science and technology in 2017 from Jiangsu University, Zhenjiang, China, where he is currently working toward the M.Eng. degree with the School of Computer Science and Communication Engineering. His current research interests include software testing and software maintenance.

Dave Towey received the B.A. and M.A. degrees in computer science, linguistics, and languages from the University of Dublin, Trinity College, Ireland; the M.Ed. degree in education leadership from the University of Bristol, U.K.; and the Ph.D. degree in computer science from The University of Hong Kong, China. He is an associate professor at University of Nottingham Ningbo China (UNNC), in Zhejiang, China, where he serves as the director of teaching and learning, and deputy head of school, for the School of Computer Science. He is also the deputy director of the International Doctoral Innovation Center at UNNC. He is a member of the UNNC Artificial Intelligence and Optimization research group. His current research interests include software testing (especially adaptive random testing, for which he was amongst the earliest researchers who established the field, and metamorphic testing), computer security, and technology-enhanced education. He co-founded the ICSE International Workshop on Metamorphic Testing in 2016. He is a member of both the IEEE and the ACM.

Weifeng Sun received the B.Eng. degree in computer science and technology in 2018 from Jiangsu University, Zhenjiang, China, where he is currently working toward the M.Eng. degree with the School of Computer Science and Communication Engineering. His current research interests include software testing and software debugging. His work has been published in journals and proceedings, including in IEEE Transactions on Software Engineering, IEEE Transactions on Reliability, and the IEEE International Conference on Software Testing, Verification and Validation (ICST). He is a student member of the China Computer Federation and the ACM.

Jinfu Chen received the BE degree in 2004 from Nanchang Hangkong University, Nanchang, China and the Ph.D. degree in 2009 from Huazhong University of Science and Technology, Wuhan, China, both in computer science. He is currently a full professor in the School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, China. His major research interests include software testing, software analysis, and trusted software.
