Large-Scale Manual Validation of Bug Fixing Commits: A Fine-grained Analysis of Tangling
arXiv - CS - Software Engineering. Pub Date: 2020-11-12, DOI: arxiv-2011.06244
Steffen Herbold, Alexander Trautsch, Benjamin Ledel, Alireza Aghamohammadi, Taher Ahmed Ghaleb, Kuljit Kaur Chahal, Tim Bossenmaier, Bhaveet Nagaria, Philip Makedonski, Matin Nili Ahmadabadi, Kristof Szabados, Helge Spieker, Matej Madeja, Nathaniel Hoy, Valentina Lenarduzzi, Shangwen Wang, Gema Rodríguez-Pérez, Ricardo Colomo-Palacios, Roberto Verdecchia, Paramvir Singh, Yihao Qin, Debasish Chakroborti, Willard Davis, Vijay Walunj, Hongjun Wu, Diego Marcilio, Omar Alam, Abdullah Aldaeej, Idan Amit, Burak Turhan, Simon Eismann, Anna-Katharina Wickert, Ivano Malavolta, Matus Sulir, Fatemeh Fard, Austin Z. Henley, Stratos Kourtzanidis, Eray Tuzun, Christoph Treude, Simin Maleki Shamasbi, Ivan Pashchenko, Marvin Wyrich, James Davis, Alexander Serebrenik, Ella Albrecht, Ethem Utku Aktas, Daniel Strüber, Johannes Erbel

Context: Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant to the study of bugs. Objective: We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits. Methods: We use a crowdsourcing approach for manual labeling to validate which changes contribute to bug fixes for each line in bug fixing commits. Each line is labeled by four participants. If at least three participants agree on the same label, we have consensus. Results: We estimate that between 17% and 32% of all changes in bug fixing commits modify the source code to fix the underlying problem. However, when we only consider changes to the production code files, this ratio increases to 66% to 87%. We find that about 11% of lines are hard to label, leading to active disagreements between participants. Due to confirmed tangling and the uncertainty in our data, we estimate that 3% to 47% of data is noisy without manual untangling, depending on the use case. Conclusion: Tangled commits have a high prevalence in bug fixes and can lead to a large amount of noise in the data. Prior research indicates that this noise may alter results. As researchers, we should be skeptics and assume that unvalidated data is likely very noisy, until proven otherwise.
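
The consensus rule in the Methods (four labels per line, consensus when at least three agree) can be illustrated with a minimal Python sketch. This is not the authors' tooling: the function name `consensus_label`, the `min_agreement` parameter, and the label strings used below are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_label(labels: Sequence[str], min_agreement: int = 3) -> Optional[str]:
    """Return the label that at least `min_agreement` participants chose.

    Returns None when no label reaches the threshold, i.e. the line has
    no consensus. The label strings themselves are assumptions; the
    abstract does not list the label set.
    """
    if not labels:
        return None
    # most_common(1) yields the single most frequent label and its count.
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_agreement else None


# Hypothetical labels from four participants for two changed lines.
agreed = consensus_label(["bugfix", "bugfix", "bugfix", "test"])
split = consensus_label(["bugfix", "bugfix", "test", "test"])
```

With four participants and a threshold of three, at most one label can reach consensus, so taking the single most frequent label is sufficient. In this sketch `agreed` is `"bugfix"` and `split` is `None`, the latter corresponding to a line without consensus.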

Updated: 2020-11-19