Do code review measures explain the incidence of post-release defects?
Empirical Software Engineering (IF 3.5), Pub Date: 2020-06-29, DOI: 10.1007/s10664-020-09837-4
Andrey Krutauz , Tapajit Dey , Peter C. Rigby , Audris Mockus

Aim In contrast to studies of defects found during code review, we aim to clarify whether code review measures can explain the prevalence of post-release defects.

Method We replicate McIntosh et al.'s (Empirical Softw. Engg. 21(5): 2146–2189, 2016) study, which uses additive regression to model the relationship between defects and code reviews. To increase external validity, we apply the same methodology to a new software project. We discuss our findings with the first author of the original study, McIntosh. We then investigate how to reduce the impact of correlated predictors in the variable selection process and how to better understand the inter-relationships among the predictors by employing Bayesian Network (BN) models.

Context As in the original study, we use the measures the authors obtained for the Qt project. We mine data from the version control system and issue tracker of Google Chrome and operationalize measures that are close analogs to the large collection of code, process, and code review measures used in the replicated study.

Results Both the data from the original study and the Chrome data show that the influence of code review measures on defects is highly unstable, with the results being highly sensitive to the variable selection procedure. Models without code review predictors fit as well as or better than those with review predictors. The replication, however, agrees with the bulk of prior work in showing that prior defects, module size, and authorship have the strongest relationship to post-release defects. The BN models helped explain the observed instability by demonstrating that the review-related predictors do not affect post-release defects directly but only through indirect effects. For example, changes that have no review discussion tend to be associated with files that have had many prior defects, which in turn increase the number of post-release defects. We hope that similar analyses of other software engineering techniques may also yield a more nuanced view of their impact. Our replication package, including our data and scripts, is publicly available (Replication package 2018).
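To make the two modeling styles concrete, here is a minimal sketch of (a) comparing regression models of post-release defects with and without review predictors and (b) learning a Bayesian Network over the same predictors to surface indirect effects. All column names (size, no_discussion, prior_defects, ...) and the synthetic data are illustrative assumptions for the sketch, not the paper's actual measures, models, or results; the real measures and scripts are in the public replication package.

```python
# Illustrative sketch only: synthetic data and hypothetical column names,
# not the study's dataset.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "size": rng.lognormal(5.0, 1.0, n),     # module size (LOC)
    "authors": rng.integers(1, 10, n),      # distinct authors per file
    "no_discussion": rng.uniform(0, 1, n),  # share of changes w/o review discussion
})
# Indirect path assumed for illustration, matching the abstract's example:
# no_discussion -> prior_defects -> post_defects.
df["prior_defects"] = rng.poisson(1 + 3 * df["no_discussion"])
df["post_defects"] = rng.poisson(
    0.3 + 0.7 * df["prior_defects"] + 0.0005 * df["size"])

base = smf.ols("np.log1p(post_defects) ~ np.log1p(size) + prior_defects"
               " + authors", data=df).fit()
full = smf.ols("np.log1p(post_defects) ~ np.log1p(size) + prior_defects"
               " + authors + no_discussion", data=df).fit()
# With the defect signal routed through prior_defects, the review
# predictor adds little explanatory power, as the paper reports.
print(f"adj. R^2 without review predictor: {base.rsquared_adj:.3f}")
print(f"adj. R^2 with review predictor:    {full.rsquared_adj:.3f}")
```

A BN structure search over discretized versions of the same variables can expose the indirect path directly. The sketch below uses pgmpy's hill-climbing search (pgmpy 0.1.x API; `pip install pgmpy`); discretizing into terciles is an arbitrary choice for the example:

```python
from pgmpy.estimators import HillClimbSearch, BicScore

cols = ["size", "authors", "no_discussion", "prior_defects", "post_defects"]
disc = df[cols].apply(lambda s: pd.qcut(s, 3, labels=False, duplicates="drop"))
dag = HillClimbSearch(disc).estimate(scoring_method=BicScore(disc))
# On data generated as above, one would expect a chain such as
# no_discussion -> prior_defects -> post_defects rather than a direct
# no_discussion -> post_defects edge.
print(sorted(dag.edges()))
```

The point of the paired sketch is the design choice the paper motivates: a regression coefficient for a review measure can vanish or flip under variable selection when the measure acts only through another predictor, whereas a BN makes that mediation explicit as graph structure.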

Updated: 2020-06-29