Deep Learning Based Code Smell Detection, IEEE Transactions on Software Engineering

Deep Learning Based Code Smell Detection
IEEE Transactions on Software Engineering (IF 6.5), Pub Date: 2019-08-20, DOI: 10.1109/tse.2019.2936376
Hui Liu, Jiahao Jin, Zhifeng Xu, Yifan Bu, Yanzhen Zou, Lu Zhang

Code smells are structures in source code that suggest the possibility of refactoring. Consequently, developers may identify refactoring opportunities by detecting code smells. However, manual identification of code smells is challenging and tedious. A number of approaches have therefore been proposed to identify code smells automatically or semi-automatically. Most of these approaches rely on manually designed heuristics to map manually selected source code metrics into predictions. However, it is challenging to select the best features manually, and it is also difficult to construct optimal heuristics manually. To address this, in this paper we propose a novel deep learning based approach to detecting code smells. The key insight is that deep neural networks and advanced deep learning techniques could automatically select features of source code for code smell detection, and could automatically build the complex mapping between such features and predictions. A major challenge for deep learning based smell detection is that deep learning often requires a large amount of labeled training data (to tune the large number of parameters within the employed deep neural network), whereas existing datasets for code smell detection are rather small. To overcome this, we propose an automatic approach to generating labeled training data for the neural network based classifier, which does not require any human intervention. As an initial attempt, we apply the proposed approach to four common and well-known code smells, i.e., feature envy, long method, large class, and misplaced class. Evaluation results on open-source applications suggest that the proposed approach significantly improves on the state of the art.
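
To make the abstract's starting point concrete, the following is a minimal sketch of the kind of metric-based heuristic that the abstract says most existing approaches rely on. It is not taken from the paper: the metric names, the `MethodMetrics` type, and both thresholds are hypothetical and chosen only for illustration. It shows why such detectors depend on hand-picked features and hand-tuned cut-offs, which is the limitation the proposed approach aims to remove.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class MethodMetrics:
    """Hand-selected metrics for one method (hypothetical feature set)."""
    name: str
    lines_of_code: int
    cyclomatic_complexity: int


# Hypothetical, manually tuned thresholds; not values from the paper.
LOC_THRESHOLD = 50
COMPLEXITY_THRESHOLD = 10


def is_long_method(m: MethodMetrics) -> bool:
    """Flag a method as a 'long method' smell using a fixed rule.

    Both the choice of metrics and the cut-offs are manual decisions.
    """
    return (m.lines_of_code > LOC_THRESHOLD
            and m.cyclomatic_complexity > COMPLEXITY_THRESHOLD)


def detect_long_methods(methods: Iterable[MethodMetrics]) -> List[str]:
    """Return the names of all methods the heuristic flags."""
    return [m.name for m in methods if is_long_method(m)]


if __name__ == "__main__":
    sample = [
        MethodMetrics("parseConfig", lines_of_code=120, cyclomatic_complexity=18),
        MethodMetrics("getName", lines_of_code=3, cyclomatic_complexity=1),
        MethodMetrics("render", lines_of_code=80, cyclomatic_complexity=4),
    ]
    print(detect_long_methods(sample))
```

Note that `render` is long but not flagged because its complexity is below the second cut-off. Whether that is the right call depends entirely on the manually chosen rule, which is the difficulty the abstract describes.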

Updated: 2019-08-20