JUGE: An Infrastructure for Benchmarking Java Unit Test Generators
arXiv - CS - Software Engineering | Pub Date: 2021-06-14 | DOI: arxiv-2106.07520
Xavier Devroey, Alessio Gambi, Juan Pablo Galeotti, René Just, Fitsum Kifetew, Annibale Panichella, Sebastiano Panichella

Researchers and practitioners have designed and implemented various automated test case generators to support effective software testing. Such generators exist for various languages (e.g., Java, C#, or Python) and for various platforms (e.g., desktop, web, or mobile applications). These generators exhibit varying effectiveness and efficiency, depending on the testing goals they aim to satisfy (e.g., unit-testing of libraries vs. system-testing of entire applications) and the underlying techniques they implement. In this context, practitioners need to be able to compare different generators to identify the one best suited to their requirements, while researchers seek to identify future research directions. This can be achieved through the systematic execution of large-scale evaluations of different generators. However, the execution of such empirical evaluations is not trivial and requires substantial effort to collect benchmarks, set up the evaluation infrastructure, and collect and analyse the results. In this paper, we present our JUnit Generation benchmarking infrastructure (JUGE), which supports generators (e.g., search-based, random-based, symbolic execution, etc.) that seek to automate the production of unit tests for various purposes (e.g., validation, regression testing, fault localization, etc.). The primary goal is to reduce the overall effort, ease the comparison of several generators, and enhance the knowledge transfer between academia and industry by standardizing the evaluation and comparison process. Since 2013, eight editions of a unit testing tool competition, co-located with the Search-Based Software Testing Workshop, have taken place, each using and updating JUGE. As a result, a growing number of tools (over ten) from both academia and industry have been evaluated on JUGE, have matured over the years, and have helped identify future research directions.
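To make concrete the kind of artifact such generators produce and that a benchmarking infrastructure must execute and measure, the following is a minimal, hand-written sketch of a JUnit 4 test written in the style of automatically generated unit tests. The class under test (java.util.ArrayDeque) and the scenarios are illustrative assumptions chosen for this sketch; they are not the output of any particular tool and do not reflect JUGE's actual API or protocol.

```java
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import java.util.ArrayDeque;
import java.util.Deque;

import org.junit.Test;

// Illustrative, generated-style unit test; the class under test and the
// scenarios are placeholders for the kind of tests generators emit.
public class ArrayDequeGeneratedTest {

    @Test
    public void pushThenPopReturnsLastElement() {
        Deque<Integer> deque = new ArrayDeque<>();
        deque.push(1);
        deque.push(2);
        // Generators typically derive expected values by executing the class
        // under test and recording the observed behaviour (regression oracles).
        assertEquals(Integer.valueOf(2), deque.pop());
        assertEquals(Integer.valueOf(1), deque.pop());
        assertTrue(deque.isEmpty());
    }

    @Test
    public void newDequeIsEmpty() {
        Deque<String> deque = new ArrayDeque<>();
        assertTrue(deque.isEmpty());
        assertEquals(0, deque.size());
    }
}
```

A benchmarking infrastructure such as JUGE then standardizes the surrounding workflow: handing each generator the classes under test with a fixed time budget, compiling and running the produced test suites, and collecting metrics (e.g., coverage and fault detection) so that different generators can be compared on equal terms.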

Updated: 2021-06-15