A Method to Assess and Argue for Practical Significance in Software Engineering
arXiv - CS - Software Engineering Pub Date : 2018-09-26 , DOI: arxiv-1809.09849
Richard Torkar, Carlo A. Furia, Robert Feldt, Francisco Gomes de Oliveira Neto, Lucas Gren, Per Lenberg, Neil A. Ernst

A key goal of empirical research in software engineering is to assess practical significance, which answers whether the observed effects of some compared treatments show a relevant difference in practice in realistic scenarios. Even though plenty of standard techniques exist to assess statistical significance, connecting it to practical significance is not straightforward or routinely done; indeed, only a few empirical studies in software engineering assess practical significance in a principled and systematic way. In this paper, we argue that Bayesian data analysis provides suitable tools to assess practical significance rigorously. We demonstrate our claims in a case study comparing different test techniques. The case study's data was previously analyzed (Afzal et al., 2015) using standard techniques focusing on statistical significance. Here, we build a multilevel model of the same data, which we fit and validate using Bayesian techniques. Our method is to apply cumulative prospect theory on top of the statistical model to quantitatively connect our statistical analysis output to a practically meaningful context. This is then the basis both for assessing and arguing for practical significance. Our study demonstrates that Bayesian analysis provides a technically rigorous yet practical framework for empirical software engineering. A substantial side effect is that any uncertainty in the underlying data will be propagated through the statistical model, and its effects on practical significance are made clear. In combination with cumulative prospect theory, Bayesian analysis therefore supports seamlessly assessing practical significance in an empirical software engineering context, potentially clarifying and extending the relevance of research for practitioners.
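
The step of applying cumulative prospect theory on top of a fitted statistical model can be illustrated with a minimal sketch. The code below is not the authors' implementation. It assumes that the model output is available as a list of equally weighted posterior draws of some outcome (positive values are gains, negative values are losses), and it uses the standard Tversky and Kahneman (1992) value and probability weighting functions with their commonly cited median parameter estimates. The function names `value`, `weight`, and `cpt_value` are hypothetical and chosen only for illustration.

```python
def value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Value function: concave for gains, convex and steeper for losses."""
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** beta)


def weight(p, gamma):
    """Inverse-S-shaped probability weighting function."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)


def cpt_value(samples, gamma_gain=0.61, gamma_loss=0.69):
    """CPT value of equally likely outcomes (assumed here: posterior draws)."""
    n = len(samples)
    xs = sorted(samples)  # ascending: losses first, gains last
    total = 0.0
    for i, x in enumerate(xs):
        if x >= 0:
            # Gains: weight the probability of doing at least this well,
            # minus the weighted probability of doing strictly better.
            pi = weight((n - i) / n, gamma_gain) - weight((n - i - 1) / n, gamma_gain)
        else:
            # Losses: weight the probability of doing at least this badly,
            # minus the weighted probability of doing strictly worse.
            pi = weight((i + 1) / n, gamma_loss) - weight(i / n, gamma_loss)
        total += pi * value(x)
    return total


# Hypothetical draws, invented purely to show the call; not data from the study.
draws = [-2.0, -0.5, 0.5, 1.0, 3.0]
score = cpt_value(draws)
```

Because every posterior draw enters the computation, uncertainty in the fitted model carries through to the resulting score, which is the propagation property the abstract describes. The parameter values and the equal weighting of draws are assumptions of this sketch and would need to be justified for a concrete decision context.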

Updated: 2020-11-04