On the feasibility of automated prediction of bug and non-bug issues



Empirical Software Engineering (IF 3.5), Pub Date: 2020-09-14, DOI: 10.1007/s10664-020-09885-w
Steffen Herbold, Alexander Trautsch, Fabian Trautsch

Context: Issue tracking systems are used to track and describe tasks in the development process, e.g., requested feature improvements or reported bugs. However, past research has shown that the reported issue types often do not match the description of the issue.

Objective: We want to understand the overall maturity of the state of the art of issue type prediction, with the goal of predicting whether issues are bugs, and to evaluate whether we can improve existing models by incorporating manually specified knowledge about issues.

Method: We train different models for the title and the description of the issue to account for the difference in structure between these fields, e.g., the length. Moreover, we manually detect issues whose description contains a null pointer exception, as these are strong indicators that issues are bugs.

Results: Our approach performs best overall, but the difference from an approach from the literature based on the fastText classifier from Facebook AI Research is not significant. The small improvements in prediction performance are due to the structural information about the issues that we used. We found that using information about the content of issues in the form of null pointer exceptions is not useful. We demonstrate the usefulness of issue type prediction through the example of labelling bugfixing commits.

Conclusions: Issue type prediction can be a useful tool if the use case either tolerates a certain number of missed bug reports or accepts that too many issues are predicted as bugs.
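The ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the regular expression, the averaging of the two model scores, and all function names are assumptions made here for illustration, since the abstract only states that separate models are trained for title and description and that null pointer exceptions in descriptions are detected. The last helper shows the trade-off mentioned in the conclusions, expressed as precision and recall for the bug class.

```python
import re

# Hypothetical pattern: the abstract does not say how mentions of
# null pointer exceptions were detected.
NPE_PATTERN = re.compile(
    r"NullPointerException|null\s+pointer\s+exception", re.IGNORECASE
)


def mentions_npe(description: str) -> bool:
    """Return True if the issue description mentions a null pointer exception."""
    return NPE_PATTERN.search(description) is not None


def combine_scores(title_score: float, description_score: float) -> float:
    """Average the bug probabilities of two separately trained models.

    Averaging is an assumption; the abstract only says that different
    models are trained for the title and the description.
    """
    return (title_score + description_score) / 2.0


def precision_recall(predicted, actual):
    """Compute precision and recall for the 'bug' class from boolean labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)      # bugs predicted as bugs
    fp = sum(1 for p, a in pairs if p and not a)  # non-bugs predicted as bugs
    fn = sum(1 for p, a in pairs if not p and a)  # missed bugs
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

A use case that tolerates missed bug reports would favour high precision, while one that accepts too many issues being predicted as bugs would favour high recall.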


Updated: 2020-09-14
