Technical debt forecasting: An empirical study on open-source repositories
Journal of Systems and Software (IF 3.7), Pub Date: 2020-12-01, DOI: 10.1016/j.jss.2020.110777
Dimitrios Tsoukalas, Dionysios Kehagias, Miltiadis Siavvas, Alexander Chatzigeorgiou

Abstract: Technical debt (TD) commonly denotes the additional costs caused by quality compromises that yield short-term benefits in the software development process but may harm the long-term quality of software products. Predicting the future value of TD could support decision-making about software maintenance and help developers and project managers take proactive action on TD repayment. However, no notable contributions exist in the field of TD forecasting, which indicates that it has scarcely been investigated. To this end, in the present paper, we empirically evaluate the ability of machine learning (ML) methods to model and predict TD evolution. More specifically, we conduct an extensive study based on a dataset that we constructed by obtaining weekly snapshots of fifteen open-source software projects over three years and using two popular static analysis tools to extract software-related metrics that can act as TD predictors. Subsequently, based on the identified TD predictors, a set of TD forecasting models is produced using popular ML algorithms and validated for various forecasting horizons. The results of our analysis indicate that regularized linear models are able to fit and provide meaningful forecasts of TD evolution for shorter forecasting horizons, while the non-linear Random Forest regression performs better than the linear models for longer forecasting horizons. In most cases, the future TD value is captured with a sufficient level of accuracy. These models can be used to facilitate planning for software evolution budget and time allocation. The approach presented in this paper provides a basis for predictive TD analysis, suitable for projects with a relatively long history. To the best of our knowledge, this is the first study that investigates the feasibility of using ML models for forecasting TD.
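To illustrate what validating a model "for various forecasting horizons" can look like in practice, the following is a minimal sketch and not the authors' pipeline. It assumes a direct forecasting setup in which each week's predictors are paired with the TD value observed a fixed number of weeks later, and it uses closed-form ridge regression as one example of a regularized linear model. The predictor names (`loc`, `smells`), the synthetic data, the 80/20 chronological split, and the MAPE error measure are all assumptions made for illustration; none of them comes from the abstract. The Random Forest comparison is omitted for brevity.

```python
import numpy as np

def make_supervised(features, td, horizon):
    """Pair each week's predictors with the TD value `horizon` weeks later."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(td, dtype=float)
    if horizon < 1 or horizon >= len(y):
        raise ValueError("horizon must be in [1, len(td) - 1]")
    return X[:-horizon], y[horizon:]

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression on standardized features."""
    mean, std = X.mean(axis=0), X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    Z = (X - mean) / std
    gram = Z.T @ Z + alpha * np.eye(Z.shape[1])
    coef = np.linalg.solve(gram, Z.T @ (y - y.mean()))
    return {"coef": coef, "intercept": y.mean(), "mean": mean, "std": std}

def predict(model, X):
    Z = (X - model["mean"]) / model["std"]
    return Z @ model["coef"] + model["intercept"]

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)

def evaluate_horizon(features, td, horizon, train_fraction=0.8, alpha=1.0):
    """Train on the earliest weeks, test on the latest ones (no shuffling)."""
    X, y = make_supervised(features, td, horizon)
    split = int(len(y) * train_fraction)
    model = fit_ridge(X[:split], y[:split], alpha)
    return mape(y[split:], predict(model, X[split:]))

# Synthetic stand-in for three years of weekly snapshots (hypothetical data).
rng = np.random.default_rng(0)
weeks = 156
loc = 10_000 + 50 * np.arange(weeks) + rng.normal(0, 100, weeks)
smells = 200 + 1.5 * np.arange(weeks) + rng.normal(0, 5, weeks)
td = 0.02 * loc + 0.8 * smells + rng.normal(0, 5, weeks)
features = np.column_stack([loc, smells, td])  # current TD is also a predictor

# One model per horizon: 1, 4, and 12 weeks ahead.
errors = {h: evaluate_horizon(features, td, h) for h in (1, 4, 12)}
for h, err in errors.items():
    print(f"{h:>2} weeks ahead: MAPE = {err:.2f}%")
```

Keeping the split chronological matters here: shuffling weekly snapshots before splitting would let the model see weeks that come after the ones it is tested on, which would overstate forecasting accuracy.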

Updated: 2020-12-01