Using BiLSTM with attention mechanism to automatically detect self-admitted technical debt, Frontiers of Computer Science


Frontiers of Computer Science (IF 4.2), Pub Date: 2021-05-25, DOI: 10.1007/s11704-020-9281-z
Dongjin Yu, Lin Wang, Xin Chen, Jie Chen

Technical debt is a metaphor for seeking short-term gains at the expense of long-term code quality. Previous studies have shown that self-admitted technical debt, which is introduced intentionally, has strong negative impacts on software development and incurs high maintenance overheads. To help developers identify self-admitted technical debt, researchers have proposed many state-of-the-art methods. However, there is still room for improvement in the effectiveness of the current methods, because self-admitted technical debt comments vary in length, account for a low proportion of comments, and are diverse in style. Therefore, in this paper, we propose a novel approach based on bidirectional long short-term memory (BiLSTM) networks with an attention mechanism to automatically detect self-admitted technical debt by leveraging source code comments. In BiLSTM, we utilize a balanced cross-entropy loss function to overcome the class imbalance problem. We experimentally investigate the performance of our approach on a public dataset of 62,566 code comments from ten open-source projects. Experimental results show that our approach achieves, on average, 81.75% precision, 72.24% recall and 75.86% F1-score, and outperforms the state-of-the-art text mining-based method by 8.14%, 5.49% and 6.64%, respectively.
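The abstract names two components without giving their formulas: a balanced cross-entropy loss and an attention mechanism over the BiLSTM outputs. The sketch below illustrates one common formulation of each in plain Python. It is not the authors' implementation. The function names `balanced_bce` and `attention_pool`, the inverse-frequency weighting, and the dot-product scoring against a `query` vector are all assumptions made for illustration.

```python
import math


def balanced_bce(probs, labels, eps=1e-12):
    """Class-balanced binary cross-entropy (assumed formulation).

    beta is the fraction of negative labels in the batch. Positive examples
    are weighted by beta and negatives by (1 - beta), so the rare positive
    class (debt comments) is not drowned out by the majority class.
    """
    n = len(labels)
    beta = sum(1 for y in labels if y == 0) / n
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if y == 1:
            total += -beta * math.log(p)
        else:
            total += -(1.0 - beta) * math.log(1.0 - p)
    return total / n


def attention_pool(hidden_states, query):
    """Dot-product attention over per-token hidden states (assumed scoring).

    hidden_states stands in for the BiLSTM outputs, one vector per token.
    Returns the softmax attention weights and the weighted-sum context vector.
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context


# One positive among three negatives, with an uninformative 0.5 prediction each.
loss = balanced_bce([0.5, 0.5, 0.5, 0.5], [1, 0, 0, 0])

# Two toy token vectors; the query points along the first dimension.
weights, context = attention_pool([[1.0, 0.0], [0.0, 1.0]], [10.0, 0.0])
```

With this weighting, the single positive example in the toy batch contributes as much to the loss as the three negatives combined, which is the effect a balanced loss is meant to achieve on a dataset where debt comments are rare. The attention weights always sum to 1, so the context vector is a convex combination of the token vectors and comments of any length reduce to a fixed-size representation.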


Updated: 2021-05-25
