Effort-Aware semi-Supervised just-in-Time defect prediction, Information and Software Technology

Information and Software Technology (IF 3.9), Pub Date: 2020-06-01, DOI: 10.1016/j.infsof.2020.106364
Weiwei Li, Wenzhou Zhang, Xiuyi Jia, Zhiqiu Huang

Context

Software defect prediction is an important technique that can help practitioners allocate their quality assurance efforts. In recent years, just-in-time (JIT) defect prediction has attracted considerable interest, as it enables developers to identify risky changes at check-in time.

Objective

Prior studies have approached JIT defect prediction from supervised and unsupervised perspectives. A model that does not rely on label information would be preferable. However, the unsupervised models proposed in previous studies perform unsatisfactorily in the classification scenario because they lack supervised information. Furthermore, most supervised models fail to outperform simple unsupervised models in the ranking scenario. To overcome these weaknesses, we study the problem from a semi-supervised perspective, which requires only a small quantity of labeled data for training.

Method

In this paper, we propose a semi-supervised model for JIT defect prediction named Effort-Aware Tri-Training (EATT), an effort-aware method that uses a greedy strategy to rank changes. We compare EATT with state-of-the-art supervised and unsupervised models at different labeled rates.
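
The abstract does not spell out the ranking criterion, so the following is only a minimal sketch of what a greedy effort-aware ranking could look like, not the authors' EATT implementation. It assumes each change carries a predicted defect probability and a review-effort proxy (for example, lines modified), ranks changes by probability per unit of effort, and keeps changes until a fixed share of the total effort is spent. The names `Change`, `rank_by_density`, `select_within_budget`, and the 20% budget are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Change:
    change_id: str
    defect_prob: float  # assumed model output in [0, 1]
    effort: int         # assumed review-effort proxy, e.g. lines modified


def rank_by_density(changes):
    """Greedy ordering: highest predicted defect probability per unit of effort first."""
    return sorted(changes, key=lambda c: c.defect_prob / max(c.effort, 1), reverse=True)


def select_within_budget(changes, budget_fraction=0.2):
    """Walk the ranking and keep changes until the effort budget would be exceeded."""
    budget = budget_fraction * sum(c.effort for c in changes)
    selected, spent = [], 0
    for change in rank_by_density(changes):
        if spent + change.effort > budget:
            break
        selected.append(change)
        spent += change.effort
    return selected


# Toy data, invented purely for illustration.
changes = [
    Change("a", 0.90, 400),
    Change("b", 0.60, 20),
    Change("c", 0.30, 5),
    Change("d", 0.10, 75),
]
ranked_ids = [c.change_id for c in rank_by_density(changes)]
picked_ids = [c.change_id for c in select_within_budget(changes)]
```

On this toy data, the large change "a" has the highest raw probability but ranks below the small changes "b" and "c", because reviewing it would consume most of the effort budget on its own. That trade-off is the point of ranking by effort rather than by probability alone.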

Results

The experimental results on six open-source projects demonstrate that EATT outperforms existing supervised and unsupervised models for effort-aware JIT defect prediction, and has similar or superior performance in classifying defect-inducing changes.

Conclusion

The results show that EATT not only achieves classification accuracy as high as that of supervised models, but also offers more practical value than the other compared models in terms of the effort needed to review changes.
