Traceability recovery between bug reports and test cases - a Mozilla Firefox case study, Automated Software Engineering


Automated Software Engineering (IF 2.0) Pub Date: 2021-07-07, DOI: 10.1007/s10515-021-00287-w
Guilherme Gadelha, Franklin Ramalho, Tiago Massoni


Automatic recovery of traceability between software artifacts may promote early detection of issues and enable better estimation of change impact. Information Retrieval (IR) techniques have been proposed for the task, but they differ considerably in input parameters and results. It is difficult to assess results when those techniques are applied in isolation, usually in small or medium-sized software projects. Recently, multilayered approaches to machine learning, in particular Deep Learning (DL), have achieved success in text classification through their capacity to model complex relationships among data. In this article, we apply several IR and DL techniques to investigate automatic traceability between bug reports and manual test cases, using historical data from Mozilla Firefox's Quality Assurance (QA) team. In this case study, we assess three IR techniques (LSI, LDA, and BM25) and a DL architecture, Convolutional Neural Networks (CNNs), applied through word embeddings. In this traceability context, we observe poor performance from three of the four studied techniques. Only the LSI technique presented acceptable results, standing out even over the state-of-the-art BM25 technique. The obtained results suggest that the semi-automatic application of the LSI technique, with an appropriate combination of thresholds, may be feasible for real-world software projects.
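To make the IR setting concrete, the following is a minimal sketch of how a bug report could be matched against manual test cases with Okapi BM25, one of the techniques named in the abstract. It is not the authors' pipeline: the function names, the sample texts, the parameters `k1` and `b` (set to commonly used defaults), and the two cut-offs `top_n` and `min_ratio` are all assumptions chosen for illustration. The cut-offs only mimic the general idea of combining a rank threshold with a score threshold.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the given query."""
    docs = [tokenize(d) for d in documents]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF variant that never goes negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def recover_traces(bug_report, test_cases, top_n=2, min_ratio=0.5):
    """Keep at most top_n test cases scoring at least min_ratio of the best score.

    Both cut-offs are hypothetical; they are not taken from the study.
    """
    scores = bm25_scores(bug_report, list(test_cases.values()))
    best = max(scores)
    if best == 0:
        return []
    ranked = sorted(zip(test_cases, scores), key=lambda p: p[1], reverse=True)
    return [name for name, s in ranked[:top_n] if s / best >= min_ratio]


# Invented sample data, for illustration only.
test_cases = {
    "TC1": "Open a new private browsing window and verify history is not saved",
    "TC2": "Download a PDF file and open it in the built-in viewer",
    "TC3": "Customize the toolbar by dragging buttons",
}
bug_report = "PDF viewer shows a blank page after download finishes"
candidate_links = recover_traces(bug_report, test_cases)
```

In a semi-automatic workflow of the kind the abstract suggests, the returned candidates would be reviewed by a person rather than accepted as final trace links.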


Updated: 2021-07-07
