On the prediction of long-lived bugs: An analysis and comparative study using FLOSS projects

Information and Software Technology (IF 3.9), Pub Date: 2020-12-28, DOI: 10.1016/j.infsof.2020.106508
Luiz Alberto Ferreira Gomes, Ricardo da Silva Torres, Mario Lúcio Côrtes

Context:

Software evolution and maintenance activities in today’s Free/Libre Open Source Software (FLOSS) projects rely primarily on information extracted from bug reports registered in bug tracking systems. Many studies point out that most bugs that adversely affect the user experience across versions of FLOSS projects are long-lived bugs. However, proposed approaches that support bug fixing procedures do not consider the real-world lifecycle of a bug, in which bugs are often fixed very quickly. This may lead to wasted effort in automating the bug management process.

Objective:

This study aims to confirm whether the number of long-lived bugs is significantly high in popular open-source projects and to characterize the population of long-lived bugs by considering the attributes of bug reports. We also aim to conduct a comparative study evaluating the prediction accuracy of five well-known machine learning algorithms and text mining techniques in the task of predicting long-lived bugs.
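
As an illustration of the first objective, the share of long-lived bugs in a set of fixed reports could be measured roughly as follows. This is a minimal sketch, not the authors' procedure: the abstract does not state the lifetime threshold used, so the 365-day cutoff, the function names, and the sample dates below are all assumptions made for the example.

```python
from datetime import date

# Assumed cutoff: the abstract does not say which threshold the study used.
LONG_LIVED_THRESHOLD_DAYS = 365


def lifetime_days(opened: date, fixed: date) -> int:
    """Number of days between report creation and the fix."""
    return (fixed - opened).days


def long_lived_share(reports, threshold=LONG_LIVED_THRESHOLD_DAYS):
    """Fraction of fixed reports whose lifetime exceeds the threshold."""
    if not reports:
        return 0.0
    long_lived = sum(
        1 for opened, fixed in reports
        if lifetime_days(opened, fixed) > threshold
    )
    return long_lived / len(reports)


# Hypothetical (opened, fixed) pairs; not data from the study.
sample_reports = [
    (date(2015, 1, 10), date(2015, 1, 12)),  # fixed within days
    (date(2015, 3, 1), date(2016, 6, 1)),    # open for more than a year
    (date(2016, 5, 5), date(2016, 5, 30)),   # fixed within a month
    (date(2014, 2, 1), date(2017, 2, 1)),    # open for three years
]

share = long_lived_share(sample_reports)
print(f"long-lived share: {share:.1%}")
```

Making the threshold a parameter lets the same computation be repeated with different cutoffs, since the proportion of long-lived bugs depends directly on where that line is drawn.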

Methods:

We collected bug reports from the repositories of six popular open-source projects (Eclipse, Freedesktop, Gnome, GCC, Mozilla, and WineHQ) and used the following machine learning algorithms to predict long-lived bugs: K-Nearest Neighbor, Naïve Bayes, Neural Networks, Random Forest, and Support Vector Machines.
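
To make the combination of text mining and classification concrete, the sketch below applies K-Nearest Neighbor, one of the five algorithms listed, to bag-of-words vectors built from report text. It is only an illustration: the abstract does not describe the features, preprocessing, or parameters used in the study, so the tokenizer, the cosine similarity measure, the value of `k`, and the training examples are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def vectorize(text):
    """Bag-of-words term-frequency vector for one bug report."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(train, text, k=3):
    """Majority label among the k training reports most similar to text."""
    query = vectorize(text)
    ranked = sorted(
        train,
        key=lambda item: cosine(vectorize(item[0]), query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labeled reports; not data from the study.
train = [
    ("crash on startup after update", "short"),
    ("typo in settings dialog label", "short"),
    ("button label misaligned in dialog", "short"),
    ("intermittent memory leak in rendering engine", "long"),
    ("race condition in rendering thread causes deadlock", "long"),
    ("memory corruption under heavy load in engine", "long"),
]

prediction = knn_predict(train, "memory leak in rendering thread", k=3)
print(prediction)
```

A real experiment would hold out part of the labeled reports for evaluation and compare the classifiers on the same split; the toy data here only shows how report text becomes a prediction.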

Results:

Our results show that long-lived bugs are relatively frequent (varying from 7.2% to 40.7%) and have unique characteristics, confirming the need to study solutions to support bug fixing management. We found that the Neural Network classifier yielded the best results in comparison to the other algorithms evaluated.

Conclusion:

Research efforts regarding long-lived bugs are needed, and our results demonstrate that it is possible to predict long-lived bugs with high accuracy (around 70.7%) despite the use of simple prediction algorithms and text mining methods.
