Predicting continuous integration build failures using evolutionary search, Information and Software Technology


Predicting continuous integration build failures using evolutionary search
Information and Software Technology (IF 3.8), Pub Date: 2020-08-18, DOI: 10.1016/j.infsof.2020.106392
Islem Saidani, Ali Ouni, Moataz Chouchen, Mohamed Wiem Mkaouer

Context: Continuous Integration (CI) is a common practice in modern software development, and it is increasingly adopted in both open-source and industrial settings. CI aims at supporting developers in integrating code changes constantly and quickly through an automated build process. However, in such a context, the build process is typically time- and resource-consuming, and it requires a high maintenance effort to avoid build failures.

Objective: The goal of this study is to introduce an automated approach that cuts the cost of CI build time and supports developers by predicting the CI build outcome.

Method: In this paper, we address the problem of CI build failure by introducing a novel search-based approach, based on Multi-Objective Genetic Programming (MOGP), to build a CI build failure prediction model. Our approach aims at finding the best combination of CI build features and their appropriate threshold values, based on two conflicting objective functions to deal with both failed and passed builds.
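To make the idea of combining features and thresholds under two conflicting objectives more concrete, here is a minimal sketch in Python. It is not the authors' implementation. The abstract does not name the two objective functions, so the sketch assumes they are the true positive rate on failed builds (to maximize) and the false positive rate on passed builds (to minimize). The rule encoding, the feature names (`prev_build_failed`, `src_files_changed`, `team_size`), and the sample builds are all hypothetical. The evolutionary operators that would generate candidate rules are omitted; only rule evaluation and Pareto filtering are shown.

```python
from typing import Dict, List, Tuple

Build = Dict[str, float]
# A rule is a nested tuple (assumed encoding, not taken from the paper):
#   ("gt", feature, threshold) or ("le", feature, threshold)
#   ("and", rule, rule) or ("or", rule, rule)
Rule = tuple


def predict_fail(rule: Rule, build: Build) -> bool:
    """Return True if the rule predicts that the build fails."""
    op = rule[0]
    if op == "and":
        return predict_fail(rule[1], build) and predict_fail(rule[2], build)
    if op == "or":
        return predict_fail(rule[1], build) or predict_fail(rule[2], build)
    if op == "gt":
        return build[rule[1]] > rule[2]
    if op == "le":
        return build[rule[1]] <= rule[2]
    raise ValueError(f"unknown operator: {op}")


def objectives(rule: Rule, builds: List[Build], failed: List[bool]) -> Tuple[float, float]:
    """Assumed objectives: (TPR on failed builds, FPR on passed builds)."""
    tp = fp = 0
    for build, did_fail in zip(builds, failed):
        if predict_fail(rule, build):
            if did_fail:
                tp += 1
            else:
                fp += 1
    n_fail = sum(failed)
    n_pass = len(failed) - n_fail
    tpr = tp / n_fail if n_fail else 0.0
    fpr = fp / n_pass if n_pass else 0.0
    return tpr, fpr


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """a dominates b: TPR no lower, FPR no higher, strictly better in one."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])


def pareto_front(rules: List[Rule], builds: List[Build], failed: List[bool]) -> List[Rule]:
    """Keep the rules whose objective scores no other rule dominates."""
    scored = [(r, objectives(r, builds, failed)) for r in rules]
    return [r for r, s in scored if not any(dominates(t, s) for _, t in scored)]


# Hypothetical toy data: five builds, two of which failed.
builds = [
    {"team_size": 3, "prev_build_failed": 1, "src_files_changed": 10},
    {"team_size": 3, "prev_build_failed": 0, "src_files_changed": 2},
    {"team_size": 12, "prev_build_failed": 0, "src_files_changed": 40},
    {"team_size": 12, "prev_build_failed": 0, "src_files_changed": 1},
    {"team_size": 5, "prev_build_failed": 1, "src_files_changed": 1},
]
failed = [True, False, True, False, False]

rule_a = ("gt", "prev_build_failed", 0.5)
rule_b = ("or", rule_a, ("gt", "src_files_changed", 20))
rule_c = ("and", rule_a, ("gt", "src_files_changed", 5))

front = pareto_front([rule_a, rule_b, rule_c], builds, failed)
```

On this toy data, `rule_b` catches every failed build at the price of one false alarm, while `rule_c` raises no false alarms but misses one failure. Neither dominates the other, so both remain on the front, which illustrates the kind of trade-off a multi-objective search has to balance.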

Results: We evaluated our approach on a benchmark of 56,019 builds from 10 large-scale and long-lived software projects that use the Travis CI build system. The statistical results reveal that our approach outperforms state-of-the-art techniques based on machine learning by providing a better balance between failed and passed builds. Furthermore, we used the generated prediction rules to investigate which factors impact the CI build results, and found that the features most indicative of the potential failure of a given build are those related to (1) specific statistics about the project, such as team size, (2) last-build information for the current build, and (3) the types of changed files.

Conclusion: This paper proposes a multi-objective search-based approach for the problem of CI build failure prediction. The performance of the models developed using our MOGP approach was statistically better than that of models developed using machine learning techniques. The experimental results show that our approach can effectively reduce both the false negative rate and the false positive rate of CI build failure prediction in highly imbalanced datasets.


Updated: 2020-08-18
