Weighted software metrics aggregation and its application to defect prediction
Empirical Software Engineering (IF 4.1), Pub Date: 2021-06-23, DOI: 10.1007/s10664-021-09984-2
Maria Ulan, Welf Löwe, Morgan Ericsson, Anna Wingkvist

It is a well-known practice in software engineering to aggregate software metrics to assess software artifacts for various purposes, such as their maintainability or their proneness to contain bugs. For different purposes, different metrics might be relevant. However, weighting these software metrics according to their contribution to the respective purpose is a challenging task. Manual approaches based on experts do not scale with the number of metrics. Also, experts get confused when the metrics are not independent, and metrics are rarely independent. Automated approaches based on supervised learning require reliable and generalizable training data, a ground truth, which is rarely available. We propose an automated approach to weighted metrics aggregation that is based on unsupervised learning. It sets metrics scores and their weights based on probability theory and aggregates them. To evaluate the effectiveness, we conducted two empirical studies on defect prediction, one on ca. 200 000 code changes and another on ca. 5 000 software classes. The results show that our approach can be used as an agnostic unsupervised predictor in the absence of a ground truth.
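
To make the idea of unsupervised weighted aggregation concrete, here is a minimal Python sketch. It is not the authors' method: the abstract only says that scores and weights are derived from probability theory. The sketch assumes that a metric's score is its empirical CDF value, that a metric's weight shrinks as it becomes more correlated with the other metrics, and that the aggregate is a weighted mean. The function names (`ecdf_scores`, `redundancy_weights`, `aggregate`) and the sample metrics are hypothetical.

```python
from math import sqrt


def ecdf_scores(values):
    """Score each value as the fraction of observations <= it (empirical CDF)."""
    n = len(values)
    return [sum(1 for other in values if other <= v) / n for v in values]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def redundancy_weights(score_columns):
    """Assumed weighting rule: 1 minus the mean absolute correlation with
    the other metrics, normalized so that the weights sum to 1."""
    names = list(score_columns)
    raw = {}
    for name in names:
        corrs = [abs(pearson(score_columns[name], score_columns[other]))
                 for other in names if other != name]
        raw[name] = 1.0 - (sum(corrs) / len(corrs) if corrs else 0.0)
    total = sum(raw.values())
    if total == 0:  # all metrics perfectly correlated: fall back to uniform
        return {name: 1.0 / len(names) for name in names}
    return {name: w / total for name, w in raw.items()}


def aggregate(metrics):
    """Return (one aggregated score per artifact, the weights used)."""
    scores = {name: ecdf_scores(vals) for name, vals in metrics.items()}
    weights = redundancy_weights(scores)
    n = len(next(iter(scores.values())))
    combined = [sum(weights[m] * scores[m][i] for m in scores) for i in range(n)]
    return combined, weights


# Hypothetical metric values for four software classes (no labels needed).
sample = {
    "loc": [10, 200, 50, 400],
    "complexity": [1, 15, 4, 30],
    "churn": [5, 3, 40, 1],
}
combined, weights = aggregate(sample)
print(weights)
print(combined)
```

No defect labels enter the computation, which is what makes the sketch unsupervised. Reading a higher aggregated score as "more defect-prone" is a further assumption that holds only if larger metric values indicate higher risk.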




Updated: 2021-06-23