Can operational profile coverage explain post‐release bug detection?
Software Testing, Verification and Reliability (IF 1.5), Pub Date: 2020-07-07, DOI: 10.1002/stvr.1735
Lucas Andrade 1 , Patrícia Machado 1 , Wilkerson Andrade 1

To deliver reliable software, developers may rely on the fault detection capability of test suites. To evaluate this capability, they can apply code coverage metrics before a software release. However, recent research results have shown that these metrics may not provide a solid basis for this evaluation. Moreover, the fixing of a fault has a cost, and not all faults have the same impact regarding software reliability. In this sense, operational testing aims at assessing parts of the system that are more valuable for users. The goal of this work is to investigate whether traditional code coverage and code coverage merged with operational information can be related to post‐release bug detection. We focus on the scope of proprietary software under continuous delivery. We performed an exploratory case study where code branch and statement coverage metrics were collected for each version of a proprietary software together with real usage data of the system. We then measured the ability to explain the bug‐fixing activity after version release using code coverage levels. We found that traditional statement coverage has a moderate negative correlation with bug‐fixing activities, whereas statement coverage merged with the operational profile has a large negative correlation with higher confidence. Developers can consider operational information as an important factor of influence that should be analysed, among other factors, together with code coverage to assess the fault detection capability of a test suite.
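The abstract does not say how statement coverage is merged with the operational profile, nor which correlation statistic is used. As a rough illustration only, the sketch below assumes that each statement is weighted by how often real usage executes it, and that the association with post-release bug fixes is measured with a Spearman rank correlation. All names (`plain_statement_coverage`, `profile_weighted_coverage`, `spearman`) and all numbers are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Set


def plain_statement_coverage(covered: Set[str], all_statements: Set[str]) -> float:
    """Traditional statement coverage: covered statements / all statements."""
    if not all_statements:
        return 0.0
    return len(covered & all_statements) / len(all_statements)


def profile_weighted_coverage(covered: Set[str], usage: Dict[str, int]) -> float:
    """Assumed merge rule: share of observed executions that hit covered statements.

    `usage` maps a statement id to the number of times real users executed it.
    """
    total = sum(usage.values())
    if total == 0:
        return 0.0
    hit = sum(count for stmt, count in usage.items() if stmt in covered)
    return hit / total


def average_ranks(values: List[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation; NaN when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def spearman(xs: List[float], ys: List[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up example: tests cover two of three statements, but not the hot one.
usage = {"s1": 90, "s2": 9, "s3": 1}
covered = {"s2", "s3"}
plain = plain_statement_coverage(covered, set(usage))
weighted = profile_weighted_coverage(covered, usage)

# Made-up per-version series: coverage level vs. bug fixes after release.
coverage_per_version = [0.2, 0.5, 0.9]
fixes_per_version = [7.0, 4.0, 1.0]
rho = spearman(coverage_per_version, fixes_per_version)
```

In this toy case the suite covers two thirds of the statements, yet only a tenth of the observed executions, which is the kind of gap that a profile-weighted metric is meant to expose. A negative `rho` would correspond to the direction reported in the abstract: higher coverage going with fewer post-release bug fixes.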

Updated: 2020-07-17