ID-correspondence: a measure for detecting evolutionary coupling
Empirical Software Engineering ( IF 3.5 ) Pub Date : 2021-01-01 , DOI: 10.1007/s10664-020-09921-9
Manishankar Mondal , Banani Roy , Chanchal K. Roy , Kevin A. Schneider

Evolutionary coupling is a well-investigated phenomenon in software maintenance research and practice. Association rules and two related measures, support and confidence, have been used to identify evolutionary coupling among program entities. However, these measures only emphasize the co-change (i.e., changing together) frequency of entities and cannot determine whether the entities co-evolved by experiencing related changes. Consequently, the approach reports false positives and fails to detect evolutionary coupling among infrequently co-changed entities. We propose a new measure, identifier correspondence (id-correspondence), that quantifies the extent to which the changes that occurred to the co-changed entities are related, based on identifier similarity. Identifiers are the names given to different program entities such as variables, methods, classes, packages, interfaces, structures, and unions. We use the Dice-Sørensen coefficient to measure lexical similarity between the identifiers involved in the changed lines of the co-changed entities. Our investigation of thousands of revisions from nine subject systems covering three programming languages shows that id-correspondence can considerably improve the detection accuracy of evolutionary coupling. It outperforms the existing state-of-the-art evolutionary-coupling-based techniques, with significantly higher recall and F-score in predicting future co-change candidates.
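
To make the measures named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that support is the number of commits in which two entities changed together, that confidence for the rule a => b is that count divided by the number of commits that changed a, and that the Dice-Sørensen coefficient is computed over character bigrams of two identifiers. The abstract does not say how identifiers are tokenized or how pairwise scores are aggregated into id-correspondence, so the function names and the bigram choice here are illustrative assumptions.

```python
from collections import Counter


def bigrams(identifier: str) -> Counter:
    """Multiset of character bigrams of a lower-cased identifier (assumed tokenization)."""
    s = identifier.lower()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def dice(a: str, b: str) -> float:
    """Dice-Sørensen coefficient 2*|X & Y| / (|X| + |Y|) over bigram multisets."""
    x, y = bigrams(a), bigrams(b)
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        # Both identifiers are shorter than two characters: fall back to equality.
        return 1.0 if a.lower() == b.lower() else 0.0
    overlap = sum((x & y).values())  # Counter '&' keeps the minimum count per bigram
    return 2.0 * overlap / total


def support(commits, a, b) -> int:
    """Number of commits in which entities a and b both changed."""
    return sum(1 for changed in commits if a in changed and b in changed)


def confidence(commits, a, b) -> float:
    """support(a, b) divided by the number of commits that changed a (rule a => b)."""
    changed_a = sum(1 for changed in commits if a in changed)
    return support(commits, a, b) / changed_a if changed_a else 0.0


if __name__ == "__main__":
    # Hypothetical history: each commit is the set of entities it changed.
    history = [{"A", "B"}, {"A", "B", "C"}, {"A"}, {"C"}]
    print(support(history, "A", "B"))      # co-change count
    print(confidence(history, "A", "B"))   # rule A => B
    print(dice("getUserName", "setUserName"))
```

In this sketch, support and confidence depend only on how often two entities appear in the same commit, while dice looks at the names touched by the changes, which is the distinction the abstract draws between co-change frequency and related changes.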

Updated: 2021-01-01