Learning Autocompletion from Real-World Datasets
arXiv - CS - Software Engineering. Pub Date: 2020-11-09. DOI: arxiv-2011.04542. Authors: Gareth Ari Aye, Seohyun Kim, Hongyu Li
Code completion is a popular software development tool integrated into all
major IDEs. Many neural language models have achieved promising results in
completion suggestion prediction on synthetic benchmarks. However, a recent
study, "When Code Completion Fails: a Case Study on Real-World Completions",
demonstrates that these results may not translate to improvements in real-world
performance. To combat this effect, we train models on real-world code
completion examples and find that these models outperform models trained on
committed source code and working version snapshots by 12.8% and 13.8% accuracy
respectively. We observe this improvement across modeling technologies and show
through A/B testing that it corresponds to a 6.2% increase in programmers'
actual autocompletion usage. Furthermore, our study characterizes a large
corpus of logged autocompletion usages to investigate why training on
real-world examples leads to stronger models.
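To make the dataset distinction concrete, here is a minimal, purely illustrative sketch of how logged completion events differ from token sequences sampled out of committed files: a real-world example pairs the as-typed (possibly non-compiling) context at the cursor with the token the programmer actually accepted. All field and function names below are hypothetical assumptions, not the schema used in the paper.

```python
# Hypothetical sketch of building training pairs from logged autocompletion
# events, rather than from committed source code. Names are illustrative only.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class CompletionEvent:
    prefix: str                      # code context before the cursor at request time
    accepted: str                    # token the programmer actually accepted
    offered: List[str] = field(default_factory=list)  # suggestions shown in the popup

def to_training_example(event: CompletionEvent) -> Tuple[str, str]:
    """Pair the in-progress editor context with the accepted completion.

    Unlike a snapshot of committed code, the prefix here reflects real
    completion-time state: partial expressions, unresolved names, etc.
    """
    return (event.prefix, event.accepted)

events = [
    CompletionEvent("resp = requests.", "get", ["get", "post", "put"]),
    CompletionEvent("for k, v in d.", "items", ["items", "keys", "values"]),
]
pairs = [to_training_example(e) for e in events]
print(pairs[0])  # ('resp = requests.', 'get')
```

A model trained on such pairs sees exactly the contexts it will be queried on in the IDE, which is one plausible reason the paper's logged-usage training data outperforms committed-code and snapshot corpora.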
Updated: 2020-11-10