Code smell detection using multi-label classification approach, Software Quality Journal


Code smell detection using multi-label classification approach
Software Quality Journal (IF 1.7), Pub Date: 2020-04-04, DOI: 10.1007/s11219-020-09498-y
Thirupathi Guggulothu, Salman Abdul Moiz

Code smells are characteristics of software that indicate a code or design problem and can make the software hard to understand, evolve, and maintain. Several code smell detection tools have been proposed in the literature, but they produce different results because smells are informally defined and subjective in nature. Machine learning techniques help address this subjectivity, since they can learn to distinguish the characteristics of smelly and non-smelly source code elements (classes or methods). However, existing machine learning techniques detect only a single type of smell per code element. This does not correspond to real-world scenarios, where a single element can have multiple design problems (smells). Further, the mechanisms proposed in the literature do not take the correlation (co-occurrence) among smells into account when detecting them. To address these shortcomings, we propose and investigate the use of multi-label classification (MLC) methods to detect whether a given code element is affected by multiple smells. In this proposal, two code smell datasets available in the literature are converted into a multi-label dataset (MLD). In the MLD, we found a positive correlation between the two smells (long method and feature envy). In the classification phase, the two MLC methods considered the correlation among the smells and improved performance (on average more than 95% accuracy) under 10-fold cross-validation repeated for ten iterations. The reported findings help researchers and developers prioritize critical code elements for refactoring based on the number of code smells detected.
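The dataset conversion, the co-occurrence check, and the prioritization idea described in the abstract can be sketched on toy data. This is a minimal illustration, not the authors' procedure: the abstract does not say how the two datasets are merged or which correlation measure is used, so the sketch assumes the datasets share element identifiers, keeps only elements present in both, and uses the phi coefficient for two binary labels. The identifiers `m1` to `m6`, the label values, and all function names are hypothetical.

```python
from math import sqrt

# Hypothetical single-label datasets: element id -> 1 (smelly) or 0 (not smelly).
long_method = {"m1": 1, "m2": 1, "m3": 0, "m4": 0, "m5": 1}
feature_envy = {"m1": 1, "m2": 0, "m3": 0, "m4": 0, "m6": 1}


def merge_to_multilabel(first, second):
    """Assumed merge rule: keep elements present in both datasets and
    pair their labels as (first_label, second_label)."""
    common = sorted(first.keys() & second.keys())
    return {eid: (first[eid], second[eid]) for eid in common}


def phi_coefficient(pairs):
    """Phi coefficient between two binary labels; > 0 means they tend to co-occur."""
    pairs = list(pairs)
    n11 = sum(1 for a, b in pairs if a and b)
    n10 = sum(1 for a, b in pairs if a and not b)
    n01 = sum(1 for a, b in pairs if not a and b)
    n00 = sum(1 for a, b in pairs if not a and not b)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # Return 0.0 when a label is constant and the coefficient is undefined.
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0


def rank_by_smell_count(mld):
    """Order element ids by the number of smells detected, most affected first."""
    return [eid for eid, labels in sorted(mld.items(), key=lambda kv: -sum(kv[1]))]


mld = merge_to_multilabel(long_method, feature_envy)
phi = phi_coefficient(mld.values())
ranking = rank_by_smell_count(mld)

print(mld)
print(round(phi, 3))
print(ranking)
```

In a multi-label setting, each element carries a label vector rather than a single class, which is what lets a classifier exploit co-occurrence between smells and lets a team rank elements by how many smells they exhibit.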


Updated: 2020-04-04
