How Far Have We Progressed in Identifying Self-admitted Technical Debts? A Comprehensive Empirical Study, ACM Transactions on Software Engineering and Methodology


ACM Transactions on Software Engineering and Methodology (IF 6.6), Pub Date: 2021-07-23, DOI: 10.1145/3447247
Zhaoqiang Guo, Shiran Liu, Jinping Liu, Yanhui Li, Lin Chen, Hongmin Lu, Yuming Zhou


Background. Self-admitted technical debt (SATD) is a special kind of technical debt that is intentionally introduced and documented in code comments. Such debt reduces software quality and increases the cost of subsequent maintenance, so it needs to be found and resolved in time. Recently, many automatic approaches have been proposed to identify SATD.

Problem. Popular IDEs support a number of predefined task annotation tags for indicating SATD in comments, and these tags are used in many projects. However, existing SATD identification approaches neglect this clear prior knowledge.

Objective. We aim to investigate how far we have really progressed in the field of SATD identification by comparing existing approaches with a simple approach that leverages the predefined task tags to identify SATD.

Method. We first propose a simple heuristic approach that fuzzily Matches task Annotation Tags (MAT) in comments to identify SATD. By nature, MAT is an unsupervised approach: it needs no data to train a prediction model and is easy to understand. Then, we examine the real progress in SATD identification by comparing MAT against existing approaches.

Result. The experimental results reveal that: (1) MAT has similar or even superior performance for SATD identification compared with existing approaches, regardless of whether non-effort-aware or effort-aware evaluation indicators are considered; (2) the SATDs (or non-SATDs) correctly identified by existing approaches overlap heavily with those identified by MAT; and (3) supervised approaches misclassify many SATDs marked with task tags as non-SATDs, which can be easily corrected by combining them with MAT.

Conclusion. It appears that the problem of SATD identification has been (unintentionally) complicated by our community, i.e., real progress in identifying SATD comments falls short of what might have been envisaged. We hence suggest that, when many task tags are used in the comments of a target project, future SATD identification studies should use MAT as an easy-to-implement baseline to demonstrate the usefulness of any newly proposed approach.
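
To make the idea of fuzzily matching task annotation tags concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not enumerate the tags or define the fuzzy rule, so both are assumptions here: the tag set (TODO, FIXME, XXX, HACK) is taken from common IDE defaults, and "fuzzy" is interpreted as a comment token starting or ending with a tag. The names `TASK_TAGS`, `is_satd`, and `combine_with_supervised` are hypothetical.

```python
import re

# Assumed tag set (common IDE task tags); the abstract does not list them.
TASK_TAGS = ("todo", "fixme", "xxx", "hack")


def is_satd(comment: str, tags=TASK_TAGS) -> bool:
    """Flag a comment as SATD if any word starts or ends with a task tag.

    Assumed fuzzy rule: lowercase the comment, split it into alphabetic
    tokens, and accept prefix or suffix matches (e.g. "TODOs", "fixme").
    No training data is involved, so the rule is unsupervised.
    """
    tokens = re.findall(r"[a-z]+", comment.lower())
    return any(
        tok.startswith(tag) or tok.endswith(tag)
        for tok in tokens
        for tag in tags
    )


def combine_with_supervised(comment: str, supervised_prediction: bool) -> bool:
    """One possible combination: treat a comment as SATD if either the
    tag matcher or a supervised classifier (supplied by the caller) says so."""
    return supervised_prediction or is_satd(comment)


if __name__ == "__main__":
    for text in ("// TODO: handle null input", "// returns the cached value"):
        print(text, "->", is_satd(text))
```

Because the rule matches prefixes and suffixes, it can also flag unrelated words that happen to contain a tag at either end (for example "hacking"); a stricter variant could require exact token matches for some tags.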


Updated: 2021-07-23
