FCCA: Hybrid Code Representation for Functional Clone Detection Using Attention Networks
IEEE Transactions on Reliability ( IF 5.0 ) Pub Date : 2020-07-22 , DOI: 10.1109/tr.2020.3001918
Wei Hua , Yulei Sui , Yao Wan , Guangzhong Liu , Guandong Xu

Code cloning, the reuse of a source code fragment via copy-and-paste with or without modification, is a common practice in code reuse and software prototyping. However, duplicated code fragments often degrade software quality and raise maintenance costs. Existing clone detectors that rely on shallow textual or syntactic features to identify code similarity remain ineffective at finding sophisticated functional code clones in real-world code bases. This article proposes the functional code clone detector using attention (FCCA), a deep-learning-based clone detection approach built on a hybrid code representation that preserves multiple code features, including unstructured information (code as sequential tokens) and structured information (code as abstract syntax trees and control-flow graphs). These features are fused into a hybrid representation equipped with an attention mechanism that focuses on the important code parts and features that contribute to the final detection accuracy. We have implemented FCCA and evaluated it on 275,777 real-world code clone pairs written in Java. The experimental results show that FCCA outperforms several state-of-the-art approaches for detecting functional code clones in terms of accuracy, recall, and F1 score.
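
The idea of fusing several code views with attention can be illustrated with a minimal sketch. This is not the FCCA architecture, which the abstract does not specify in detail. It assumes each view (tokens, AST, CFG) has already been encoded into a fixed-length vector, and it uses plain dot-product attention against a hypothetical query vector, followed by cosine similarity between the two fused vectors. All names and numbers below are illustrative assumptions.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(features, query):
    """Fuse per-view feature vectors into one vector.

    Each view is scored by its dot product with `query` (a stand-in for a
    learned parameter); the softmax of those scores weights the views.
    """
    weights = softmax([dot(f, query) for f in features])
    dim = len(query)
    fused = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return fused, weights

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot(a, b) / (na * nb)

# Hypothetical, hand-written embeddings for two code fragments.
# Order of views: token sequence, AST, CFG.
fragment_a = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1], [0.1, 0.2, 0.9]]
fragment_b = [[0.1, 0.9, 0.0], [0.3, 0.7, 0.1], [0.1, 0.3, 0.8]]
query = [0.2, 0.5, 1.0]  # assumed; a real model would learn this

fused_a, weights_a = attention_fuse(fragment_a, query)
fused_b, weights_b = attention_fuse(fragment_b, query)
similarity = cosine(fused_a, fused_b)
print("view weights for fragment A:", weights_a)
print("similarity:", similarity)
```

In a trained detector the view encoders, the attention parameters, and the decision threshold on the similarity score would all be learned from labeled clone pairs rather than fixed by hand as they are here.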

Updated: 2020-07-22