


Bug Severity Prediction Using Question-and-Answer Pairs from Stack Overflow
Journal of Systems and Software (IF 3.7), Pub Date: 2020-07-01, DOI: 10.1016/j.jss.2020.110567
Youshuai Tan, Sijie Xu, Zhaowei Wang, Tao Zhang, Zhou Xu, Xiapu Luo

Abstract: Bugs are common in most software systems. In large-scale software projects, developers usually carry out maintenance tasks with the help of software artifacts such as bug reports. The severity of a bug report describes the impact of the bug and determines how quickly it needs to be fixed. Bug triagers pay close attention to features such as severity to judge the importance of bug reports and assign them to the correct developers. However, the large number of bug reports submitted every day increases the workload of developers, who have to spend more time fixing bugs. In this paper, we collect question-and-answer pairs from Stack Overflow and use logistic regression to predict the severity of bug reports. Specifically, we extract all the posts related to the bug repositories from Stack Overflow and combine them with bug reports to obtain enhanced versions of the bug reports. We evaluate severity prediction on three popular open source projects (Mozilla, Eclipse, and GCC) against approaches based on Naive Bayes, k-Nearest Neighbor (KNN), and Long Short-Term Memory (LSTM). The experimental results show that our model predicts severity more accurately than the previous studies. Compared with the Naive Bayes based approach, which performs best among all baseline approaches, our approach improves the average F-measure by 23.03%, 21.86%, and 20.59% for Mozilla, Eclipse, and GCC, respectively.
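The abstract reports its results as an average F-measure. As a reminder of what that metric measures, here is a minimal sketch that computes per-class precision, recall, and F-measure and then takes an unweighted mean over the severity classes. The abstract does not say whether the average is unweighted or weighted by class size, so the unweighted (macro) mean is an assumption, and the function names and severity labels are hypothetical.

```python
from collections import Counter


def per_class_f_measure(y_true, y_pred):
    """Return {label: (precision, recall, f_measure)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        # F-measure is the harmonic mean of precision and recall.
        f_measure = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f_measure)
    return scores


def average_f_measure(y_true, y_pred):
    """Unweighted mean of per-class F-measures (an assumed averaging scheme)."""
    scores = per_class_f_measure(y_true, y_pred)
    return sum(f for _, _, f in scores.values()) / len(scores)


if __name__ == "__main__":
    # Hypothetical severity labels, for illustration only.
    truth = ["critical", "major", "minor", "major", "minor", "critical"]
    predicted = ["critical", "major", "major", "major", "minor", "minor"]
    print(per_class_f_measure(truth, predicted))
    print(average_f_measure(truth, predicted))
```

An unweighted mean gives every severity class the same influence, which matters when rare classes such as the most severe bugs are the ones of interest. A weighted mean would instead favor the most frequent class.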


Updated: 2020-07-01
