Sequential coding patterns: How to use them effectively in code recommendation
Information and Software Technology ( IF 3.8 ) Pub Date : 2021-07-19 , DOI: 10.1016/j.infsof.2021.106690
Luiz Laerte Nunes da Silva 1 , Troy Costa Kohwalter 1 , Alexandre Plastino 1 , Leonardo Gresta Paulino Murta 1

Context:

Some programming constructs frequently appear together in different parts of the code, forming sequential coding patterns throughout the project. These patterns can be mined from the project repository and, whenever the code a developer is writing coincides with the beginning of a sequential pattern, the remainder of that pattern can be suggested to the developer. This is analogous to conventional code completion, which suggests syntactic structures based on the line being written. However, instead of offering syntactic suggestions to complete the current line, this feature suggests multi-line code snippets.
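The matching step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that patterns are stored as tuples of abstract construct names, that matching is contiguous (no gaps between pattern elements), and that the names `suggest_remainders`, `patterns`, and `recent` are hypothetical.

```python
from typing import List, Sequence, Tuple


def suggest_remainders(
    recent: Sequence[str],
    patterns: Sequence[Tuple[str, ...]],
) -> List[Tuple[str, ...]]:
    """Return the remainder of each pattern whose beginning matches the end of `recent`.

    Assumption: a pattern matches only when its prefix is a contiguous
    suffix of the constructs the developer has written so far.
    """
    suggestions = []
    for pattern in patterns:
        # Try the longest proper prefix first, so at least one element is left to suggest.
        longest = min(len(pattern) - 1, len(recent))
        for k in range(longest, 0, -1):
            if tuple(recent[-k:]) == pattern[:k]:
                suggestions.append(pattern[k:])
                break
    return suggestions


# Hypothetical mined patterns and the constructs written so far.
patterns = [
    ("open_file", "read_line", "close_file"),
    ("acquire_lock", "update_state", "release_lock"),
]
recent = ["parse_args", "open_file", "read_line"]

print(suggest_remainders(recent, patterns))  # [('close_file',)]
```

Only the first pattern is suggested here, because the last two constructs written coincide with its first two elements.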

Objective:

This paper contributes an in-depth study of how code pattern recommendation can be used effectively.

Method:

We answer three research questions through a quantitative study using a robust experimental infrastructure with a corpus of five open-source projects: (1) “In a code recommendation, how many frequent coding patterns should be presented?”, (2) “What is the impact of filtering sequential patterns by their confidence?”, and (3) “Does the effectiveness of the sequential coding patterns degrade over time?”.

Results:

Our study shows that it is possible to achieve correctness above 80% when using the suggestions with the highest confidence values, and that a confidence threshold of 30% generally provides better outcomes. It further shows that the effectiveness of frequent code pattern completion tends to degrade 50 commits after the patterns have been mined.

Conclusion:

We could observe that: (1) the top five ranked suggestions are the ones that deliver the best results; (2) the code recommendations that deliver the best results are the ones with the highest confidence values; and (3) the code recommendation performance degrades as the source code evolves because patterns become outdated.
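The first two observations suggest a simple way to present recommendations: drop candidates below a confidence threshold, then show only the highest-ranked few. The following is a minimal sketch under stated assumptions, not the authors' tool: the `Suggestion` type and `rank_suggestions` function are hypothetical, confidence is assumed to be a value in [0, 1], and the defaults (0.30 and 5) simply mirror the figures reported above.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Suggestion(NamedTuple):
    remainder: Tuple[str, ...]  # constructs that would complete the pattern
    confidence: float           # assumed to lie in [0, 1]


def rank_suggestions(
    candidates: Sequence[Suggestion],
    min_confidence: float = 0.30,  # threshold taken from the reported results
    top_k: int = 5,                # list size taken from the reported results
) -> List[Suggestion]:
    """Keep candidates at or above the threshold, highest confidence first."""
    kept = [s for s in candidates if s.confidence >= min_confidence]
    kept.sort(key=lambda s: s.confidence, reverse=True)
    return kept[:top_k]


# Hypothetical candidates with made-up confidence values.
candidates = [
    Suggestion(("log_error",), 0.25),
    Suggestion(("flush", "close_file"), 0.45),
    Suggestion(("close_file",), 0.82),
]

ranked = rank_suggestions(candidates)
print([s.confidence for s in ranked])  # [0.82, 0.45]
```

Because the third observation is that patterns become outdated as the code evolves, a real tool would presumably also re-mine the patterns periodically; that step is outside the scope of this sketch.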



Updated: 2021-07-23