Generating API tags for tutorial fragments from Stack Overflow
Empirical Software Engineering (IF 4.1), Pub Date: 2021-05-08, DOI: 10.1007/s10664-021-09962-8
Di Wu, Xiao-Yuan Jing, Hongyu Zhang, Bing Li, Yu Xie, Baowen Xu

API tutorials are important learning resources because they explain how to use certain APIs in a given programming context. An API tutorial can be split into a number of units. Consecutive units that describe the same topic are often called a tutorial fragment. We consider the API explained by a tutorial fragment to be its API tag. Generating API tags for a tutorial fragment can help readers understand, navigate, and retrieve the fragment. Existing approaches to API tag generation often perform poorly because they require high manual effort and achieve low accuracy. Like API tutorials, Stack Overflow (SO) is an important learning resource that provides explanations of APIs, so SO posts also contain API tags. Moreover, the API tags of SO posts are abundant and can be extracted easily. In this paper, we propose a novel approach, ATTACK (API Tag for Tutorial frAgments using Crowd Knowledge), which automatically generates API tags for tutorial fragments from SO posts. ATTACK first constructs \(\left\langle Q\&A\ pair,\ tag\ set \right\rangle\) pairs by extracting the API tags of SO posts. Then, it trains a deep neural network with an attention mechanism to learn the semantic relatedness between Q&A pairs and their associated API tags, taking into consideration both the textual descriptions and the code in a Q&A pair. Finally, the trained model is used to generate API tags for tutorial fragments. We evaluate ATTACK on public Java and Android datasets containing 43,132 \(\left\langle Q\&A\ pair,\ tag\ set \right\rangle\) pairs. Experimental results show that ATTACK is effective and outperforms state-of-the-art approaches in terms of F-Measure. Our user study further confirms the effectiveness of ATTACK in generating API tags for tutorial fragments. We also apply ATTACK to document linking, and the results confirm the usefulness of the API tags it generates.
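
The abstract reports results in terms of F-Measure but does not say how the score is computed per fragment or averaged across fragments. As a rough illustration only, the sketch below assumes a set-based comparison between the tags generated for one tutorial fragment and a reference tag set. The function name `tag_set_scores` and the example tags are hypothetical and do not come from the paper.

```python
from typing import Iterable, Set, Tuple


def tag_set_scores(
    predicted: Iterable[str], expected: Iterable[str]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) for one fragment's API tags.

    Assumption: tags are compared as exact strings in unordered sets.
    """
    pred: Set[str] = set(predicted)
    gold: Set[str] = set(expected)
    # With no predictions or no reference tags, report zeros instead of dividing by zero.
    if not pred or not gold:
        return 0.0, 0.0, 0.0
    hits = len(pred & gold)
    precision = hits / len(pred)
    recall = hits / len(gold)
    if hits == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical tags for a single tutorial fragment (illustrative only).
predicted_tags = ["java.util.List", "java.util.ArrayList", "java.util.Map"]
expected_tags = ["java.util.List", "java.util.ArrayList"]
p, r, f = tag_set_scores(predicted_tags, expected_tags)
print(f"precision={p:.3f} recall={r:.3f} F-measure={f:.3f}")
```

In this example, two of the three generated tags match the reference set, so precision is lower than recall. A dataset-level score would need an averaging scheme across fragments, which the abstract does not specify.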



Updated: 2021-05-08