Improved Code Summarization via a Graph Neural Network
arXiv - CS - Software Engineering Pub Date : 2020-04-06 , DOI: arxiv-2004.02843
Alexander LeClair, Sakib Haque, Lingfei Wu, Collin McMillan

Automatic source code summarization is the task of generating natural language descriptions for source code. Automatic code summarization is a rapidly expanding research area, especially as the community has taken greater advantage of advances in neural network and AI technologies. In general, source code summarization techniques take the source code as input and output a natural language description. Yet a strong consensus is developing that using structural information as input leads to improved performance. The first approaches to use structural information flattened the AST into a sequence. Recently, more complex approaches based on random AST paths or graph neural networks have improved on the models using flattened ASTs. However, the literature still does not describe using a graph neural network together with the source code sequence as separate inputs to a model. Therefore, in this paper, we present an approach that uses a graph-based neural architecture that better matches the default structure of the AST to generate these summaries. We evaluate our technique using a data set of 2.1 million Java method-comment pairs and show improvement over four baseline techniques, two from the software engineering literature and two from the machine learning literature.
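
To make the contrast between a flattened AST and a graph view of the same AST concrete, the following is a minimal sketch and not the authors' implementation. It assumes Python's standard `ast` module as a stand-in parser (the paper works with Java methods), and the helper name `ast_views` is hypothetical. It returns the node types in pre-order, which is one way to flatten a tree into a sequence, alongside the parent-to-child edge list that a graph-based model could consume instead.

```python
import ast


def ast_views(source):
    """Return two views of the AST of `source`.

    sequence: node type names in pre-order (a flattened AST).
    edges:    (parent_index, child_index) pairs over the same nodes,
              i.e. the tree structure that flattening discards.
    Illustrative only: uses Python's own parser, not a Java parser.
    """
    tree = ast.parse(source)
    sequence = []  # node index -> node type name
    edges = []     # parent -> child links

    def visit(node, parent_index):
        index = len(sequence)
        sequence.append(type(node).__name__)
        if parent_index is not None:
            edges.append((parent_index, index))
        for child in ast.iter_child_nodes(node):
            visit(child, index)

    visit(tree, None)
    return sequence, edges


if __name__ == "__main__":
    seq, edges = ast_views("def add(a, b):\n    return a + b\n")
    print(seq)    # flattened view: one token per AST node
    print(edges)  # graph view: explicit parent-child structure
```

The sequence alone does not say which node is the parent of which; the edge list keeps that information, which is the kind of structure a graph neural network can use directly.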

Updated: 2020-04-08