The Impact of Correlated Metrics on the Interpretation of Defect Models
IEEE Transactions on Software Engineering (IF 7.4), Pub Date: 2021-02-01, DOI: 10.1109/tse.2019.2891758
Jirayus Jiarpakdee, Chakkrit Tantithamthavorn, Ahmed E. Hassan

Defect models are analytical models for building empirical theories related to software quality. Prior studies often derive knowledge from such models using interpretation techniques, e.g., ANOVA Type-I. Recent work raises concerns that correlated metrics may impact the interpretation of defect models. Yet, the impact of correlated metrics in such models has not been investigated. In this paper, we investigate the impact of correlated metrics on the interpretation of defect models and the improvement of the interpretation of defect models when removing correlated metrics. Through a case study of 14 publicly available defect datasets, we find that (1) correlated metrics have the largest impact on the consistency, the level of discrepancy, and the direction of the ranking of metrics, especially for ANOVA techniques. On the other hand, we find that removing all correlated metrics (2) improves the consistency of the produced rankings regardless of the ordering of metrics (except for ANOVA Type-I); (3) improves the consistency of the ranking of metrics among the studied interpretation techniques; and (4) impacts the model performance by less than 5 percentage points. Thus, when one wishes to derive sound interpretation from defect models, one must (1) mitigate correlated metrics, especially for ANOVA analyses; and (2) avoid using ANOVA Type-I even if all correlated metrics are removed.
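
The ordering sensitivity of ANOVA Type-I mentioned in the abstract can be illustrated with a small sketch. This is not the authors' experimental setup: it assumes an ordinary least-squares model on synthetic data rather than a defect model fitted to the studied datasets, and the names `size`, `complexity`, and `sequential_ss` are hypothetical, chosen only for illustration. It computes Type-I (sequential) sums of squares for two strongly correlated metrics in both orders.

```python
import numpy as np

def sequential_ss(X, y):
    """Type-I (sequential) sum of squares for each column of X, in column order."""
    n = len(y)
    design = np.ones((n, 1))  # start from the intercept-only model
    rss_prev = float(np.sum((y - y.mean()) ** 2))
    out = []
    for j in range(X.shape[1]):
        # Add the next metric and measure how much the residual error drops.
        design = np.column_stack([design, X[:, j]])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        rss = float(np.sum((y - design @ beta) ** 2))
        out.append(rss_prev - rss)
        rss_prev = rss
    return out

# Synthetic data (assumption): two metrics that are almost collinear.
rng = np.random.default_rng(0)
n = 500
size = rng.normal(size=n)
complexity = size + 0.1 * rng.normal(size=n)
y = size + complexity + rng.normal(size=n)

X = np.column_stack([size, complexity])
ss_size_first = sequential_ss(X, y)                 # [SS(size), SS(complexity | size)]
ss_complexity_first = sequential_ss(X[:, ::-1], y)  # [SS(complexity), SS(size | complexity)]

print("size entered first:      ", ss_size_first)
print("complexity entered first:", ss_complexity_first)
```

In this sketch, whichever metric enters the model first absorbs most of the shared explained variance, so the apparent ranking of the two metrics flips with the ordering even though the total explained variance is unchanged.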

Updated: 2021-02-01