

On-the-Fly Adaptation of Source Code Models using Meta-Learning
arXiv - CS - Software Engineering, Pub Date: 2020-03-26, DOI: arxiv-2003.11768
Disha Shrivastava, Hugo Larochelle and Daniel Tarlow

The ability to adapt to unseen, local contexts is an important challenge that successful models of source code must overcome. One of the most popular approaches for the adaptation of such models is dynamic evaluation. With dynamic evaluation, when running a model on an unseen file, the model is updated immediately after having observed each token in that file. In this work, we propose instead to frame the problem of context adaptation as a meta-learning problem. We aim to train a base source code model that is best able to learn from information in a file to deliver improved predictions of missing tokens. Unlike dynamic evaluation, this formulation allows us to select more targeted information (support tokens) for adaptation, drawn from both before and after a target hole in a file. We consider an evaluation setting that we call line-level maintenance, designed to reflect the downstream task of code auto-completion in an IDE. Leveraging recent developments in meta-learning such as first-order MAML and Reptile, we demonstrate improved performance in experiments on a large-scale Java GitHub corpus, compared to other adaptation baselines including dynamic evaluation. Moreover, our analysis shows that, compared to a non-adaptive baseline, our approach improves performance on identifiers and literals by 44% and 15%, respectively.
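To make the contrast between dynamic evaluation and support-token adaptation concrete, here is a minimal Python sketch. It is not the authors' implementation: the model is a toy unigram softmax over a four-token vocabulary standing in for a real sequence model, the outer update is the generic Reptile rule rather than the paper's exact training procedure, and every function name, learning rate, and token ID is a hypothetical choice made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def nll(logits, token):
    """Negative log-likelihood of one token under the toy unigram model."""
    return -math.log(softmax(logits)[token])


def adapt(logits, support, lr=0.5, steps=3):
    """Inner loop: a few SGD steps on the support tokens of one file."""
    adapted = list(logits)
    for _ in range(steps):
        for tok in support:
            probs = softmax(adapted)
            # Gradient of cross-entropy w.r.t. the logits is probs - one_hot(tok).
            for i in range(len(adapted)):
                adapted[i] -= lr * (probs[i] - (1.0 if i == tok else 0.0))
    return adapted


def dynamic_evaluation(logits, tokens, lr=0.5):
    """Baseline: score each token, then update on it immediately.

    Only tokens seen so far (left context) can influence a prediction.
    """
    current = list(logits)
    losses = []
    for tok in tokens:
        losses.append(nll(current, tok))
        current = adapt(current, [tok], lr=lr, steps=1)
    return losses, current


def reptile_step(logits, support, meta_lr=0.1):
    """Generic Reptile outer update: move the base weights toward the adapted ones."""
    adapted = adapt(logits, support)
    return [w + meta_lr * (a - w) for w, a in zip(logits, adapted)]


if __name__ == "__main__":
    base = [0.0, 0.0, 0.0, 0.0]   # hypothetical base model, vocabulary of 4 token IDs
    support = [2, 2, 1, 2]        # support tokens, may come from before AND after the hole
    hole = 2                      # the token to predict at the hole

    print("loss before adaptation:", nll(base, hole))
    print("loss after adaptation: ", nll(adapt(base, support), hole))
    print("base after one Reptile step:", reptile_step(base, support))
```

The design point the sketch tries to show is that `adapt` accepts an arbitrary set of support tokens, so they can be chosen from anywhere in the file, whereas `dynamic_evaluation` is tied to the order in which tokens are observed. A first-order MAML variant would instead use the gradient of the hole loss at the adapted weights for the outer update; that variant is omitted here to keep the example short.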


Updated: 2020-09-22
