On the prediction of long-lived bugs: An analysis and comparative study using FLOSS projects
Information and Software Technology (IF 3.9), Pub Date: 2020-12-28, DOI: 10.1016/j.infsof.2020.106508
Luiz Alberto Ferreira Gomes, Ricardo da Silva Torres, Mario Lúcio Côrtes

Context:

Software evolution and maintenance activities in today's Free/Libre Open Source Software (FLOSS) projects rely primarily on information extracted from bug reports registered in bug tracking systems. Many studies point out that most bugs that adversely affect the user experience across versions of FLOSS projects are long-lived bugs. However, existing approaches that support bug-fixing procedures do not consider the real-world lifecycle of a bug, in which bugs are often fixed very quickly. As a result, efforts to automate the bug management process may be wasted.

Objective:

This study aims to confirm whether the number of long-lived bugs is significantly high in popular open-source projects and to characterize the population of long-lived bugs by considering the attributes of bug reports. We also aim to conduct a comparative study evaluating the prediction accuracy of five well-known machine learning algorithms and text mining techniques in the task of predicting long-lived bugs.

Methods:

We collected bug reports from the repositories of six popular open-source projects (Eclipse, Freedesktop, Gnome, GCC, Mozilla, and WineHQ) and used the following machine learning algorithms to predict long-lived bugs: K-Nearest Neighbor, Naïve Bayes, Neural Networks, Random Forest, and Support Vector Machines.
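
To make the prediction task concrete, the following is a minimal sketch of one of the five listed algorithms (Naïve Bayes) applied to bug-report text. It is not the authors' pipeline. The 365-day cutoff, the whitespace tokenizer, the helper names (`label`, `tokenize`, `NaiveBayes`, `accuracy`), and the four toy reports are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
from collections import Counter, defaultdict

# Assumed cutoff: the abstract does not say how many days make a bug "long-lived".
LONG_LIVED_DAYS = 365


def label(days_to_fix, threshold=LONG_LIVED_DAYS):
    """Return 'long' if the bug took more than `threshold` days to fix, else 'short'."""
    return "long" if days_to_fix > threshold else "short"


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately crude bag of words)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # number of reports per class
        self.word_counts = defaultdict(Counter)  # per-class word frequencies
        for text, lab in zip(texts, labels):
            self.word_counts[lab].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for lab, n_docs in self.class_counts.items():
            counts = self.word_counts[lab]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = lab, score
        return best_label


def accuracy(model, texts, labels):
    """Fraction of reports whose predicted class matches the true class."""
    hits = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return hits / len(labels)


# Made-up toy reports: (description, days until fixed). Not data from the study.
train = [
    ("crash on startup after update", 3),
    ("typo in preferences dialog", 5),
    ("rendering glitch with legacy driver on rare hardware", 700),
    ("intermittent deadlock in legacy printing backend", 900),
]
texts = [t for t, _ in train]
labels = [label(d) for _, d in train]
model = NaiveBayes().fit(texts, labels)
prediction = model.predict("deadlock with legacy driver")
```

A real comparison would evaluate each classifier on held-out reports rather than on the training set, and would likely use a richer text representation than raw word counts.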

Results:

Our results show that long-lived bugs are relatively frequent (varying from 7.2% to 40.7%) and have unique characteristics, confirming the need to study solutions to support bug fixing management. We found that the Neural Network classifier yielded the best results in comparison to the other algorithms evaluated.

Conclusion:

Research efforts regarding long-lived bugs are needed, and our results demonstrate that it is possible to predict long-lived bugs with high accuracy (around 70.7%) despite the use of simple prediction algorithms and text mining methods.



Updated: 2021-01-02