An empirical study on clone consistency prediction based on machine learning, Information and Software Technology

An empirical study on clone consistency prediction based on machine learning
Information and Software Technology (IF 3.9), Pub Date: 2021-03-23, DOI: 10.1016/j.infsof.2021.106573
Fanlong Zhang, Siau-cheng Khoo

Context:

Code clones have been accepted as a common phenomenon in software, driven by the increasing demand for rapid software production. Developers recognize code clones in the form of a clone group, which comprises several clone fragments that are similar to one another. A change to one of these fragments may require “consistent changes” to the remaining clones in the same group, which increases maintenance costs. Failing to make such a consistent change when it is necessary is commonly known as a “clone consistency-defect”, which can adversely impact software maintainability.
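
The clone-group idea described above can be illustrated with a minimal Python sketch. It is not the authors' tooling: the names `CloneFragment`, `CloneGroup`, and `change_status` are hypothetical, and the sketch assumes a simplified model in which each fragment is only marked as edited or not edited in the revision under review. It flags groups where some, but not all, fragments were edited.

```python
from dataclasses import dataclass, field


@dataclass
class CloneFragment:
    """One code fragment in a clone group (hypothetical record)."""
    path: str
    changed: bool = False  # assumption: edited in the revision under review?


@dataclass
class CloneGroup:
    """A set of fragments that are similar to one another."""
    fragments: list[CloneFragment] = field(default_factory=list)

    def change_status(self) -> str:
        """Classify the group by how many of its fragments were edited."""
        changed = sum(f.changed for f in self.fragments)
        if changed == 0:
            return "unchanged"
        if changed == len(self.fragments):
            return "consistent"
        # Only some fragments were edited: a candidate for review,
        # not proof of a consistency-defect.
        return "inconsistent"


group = CloneGroup([
    CloneFragment("a/Parser.java", changed=True),
    CloneFragment("b/Parser.java", changed=False),
])
print(group.change_status())
```

In this simplified model, "consistent" only means that every fragment was edited together; it does not check that the edits themselves match. An "inconsistent" group becomes a consistency-defect only when the missing change was actually necessary, which this check cannot decide on its own.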

Objective:

Predicting the need for “clone consistent changes” after successful clone-creating or clone-changing operations can help developers maintain clone changes effectively, avoid consistency-defects and reduce maintenance cost.

Method:

In this work, we use several sets of attributes in two scenarios of clone operations (clone-creating and clone-changing), and conduct an empirical study on five different machine-learning methods to assess how well each predicts clone consistency, that is, whether a given clone operation will require or be free of clone consistency maintenance in the future.

Results:

We perform our experiments on eight open-source projects. Our study shows that such predictions can be reasonably effective for both clone-creating and clone-changing instances. We also investigate the use of five different machine-learning methods for prediction and show that our selected features are effective in predicting the need for consistency maintenance across all selected machine-learning methods.

Conclusion:

The empirical study conducted here demonstrates that the models developed by different machine-learning methods with the specified sets of attributes have the ability to perform clone-consistency prediction.
