Empirical Study of Restarted and Flaky Builds on Travis CI, arXiv - CS - Software Engineering

Empirical Study of Restarted and Flaky Builds on Travis CI
arXiv - CS - Software Engineering, Pub Date: 2020-03-26, DOI: arxiv-2003.11772
Thomas Durieux, Claire Le Goues, Michael Hilton and Rui Abreu

Continuous Integration (CI) is a development practice where developers frequently integrate code into a common codebase. After the code is integrated, the CI server runs a test suite and other tools to produce a set of reports (e.g., output of linters and tests). If the result of a CI test run is unexpected, developers have the option to manually restart the build, re-running the same test suite on the same code; this can reveal build flakiness if the restarted build outcome differs from the original build. In this study, we analyze restarted builds, flaky builds, and their impact on the development workflow. We observe that developers restart at least 1.72% of builds, amounting to 56,522 restarted builds in our Travis CI dataset. We observe that more mature and more complex projects are more likely to include restarted builds. The restarted builds are mostly builds that initially fail due to a test, a network problem, or a Travis CI limitation such as an execution timeout. Finally, we observe that restarted builds have a major impact on the development workflow. Indeed, in 54.42% of the restarted builds, the developers analyze and restart a build within an hour of the initial failure. This suggests that developers wait for CI results, interrupting their workflow to address the issue. Restarted builds also slow down the merging of pull requests by a factor of three, bringing the median merging time from 16h to 48h.
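
The abstract's notion of a flaky build (a restart of the same code whose outcome differs from the original run) can be illustrated with a minimal Python sketch. The `BuildRun` record, its field names, the state strings, and both helper functions are hypothetical and chosen only for illustration; they are not the authors' tooling or the Travis CI API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildRun:
    """One execution of a CI build (hypothetical record, not a Travis CI type)."""
    build_id: int
    commit: str
    state: str  # assumed values such as "passed", "failed", "errored"


def is_flaky(original: BuildRun, restarted: BuildRun) -> bool:
    """Return True when a restart of the same build on the same code
    ends in a different state than the original run."""
    if (original.build_id, original.commit) != (restarted.build_id, restarted.commit):
        raise ValueError("a restart must re-run the same build on the same commit")
    return original.state != restarted.state


def slowdown_factor(baseline_hours: float, observed_hours: float) -> float:
    """Ratio between two median merge times."""
    return observed_hours / baseline_hours


first = BuildRun(build_id=1, commit="abc123", state="failed")
rerun = BuildRun(build_id=1, commit="abc123", state="passed")
print(is_flaky(first, rerun))   # outcome changed without a code change
print(slowdown_factor(16, 48))  # median merge times reported in the abstract
```

Comparing only the final state is a simplification: the study also looks at why the initial build failed (test, network problem, or CI limitation), which this sketch does not model.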

Updated: 2020-03-27