Break-It-Fix-It: Unsupervised Learning for Program Repair
arXiv - CS - Software Engineering. Pub Date: 2021-06-11, DOI: arxiv-2106.06600
Michihiro Yasunaga, Percy Liang

We consider repair tasks: given a critic (e.g., compiler) that assesses the quality of an input, the goal is to train a fixer that converts a bad example (e.g., code with syntax errors) into a good one (e.g., code with no errors). Existing works create training data consisting of (bad, good) pairs by corrupting good examples using heuristics (e.g., dropping tokens). However, fixers trained on this synthetically-generated data do not extrapolate well to the real distribution of bad inputs. To bridge this gap, we propose a new training approach, Break-It-Fix-It (BIFI), which has two key ideas: (i) we use the critic to check a fixer's output on real bad inputs and add good (fixed) outputs to the training data, and (ii) we train a breaker to generate realistic bad code from good code. Based on these ideas, we iteratively update the breaker and the fixer while using them in conjunction to generate more paired data. We evaluate BIFI on two code repair datasets: GitHub-Python, a new dataset we introduce where the goal is to repair Python code with AST parse errors; and DeepFix, where the goal is to repair C code with compiler errors. BIFI outperforms existing methods, obtaining 90.5% repair accuracy on GitHub-Python (+28.5%) and 71.7% on DeepFix (+5.6%). Notably, BIFI does not require any labeled data; we hope it will be a strong starting point for unsupervised learning of various repair tasks.
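
The data-generation step behind the two ideas above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the critic is approximated with Python's `ast.parse` (matching the "AST parse errors" setting of GitHub-Python), the fixer and breaker are stand-in string functions rather than trained models, and the names `critic`, `collect_pairs`, `toy_fixer`, and `toy_breaker` are hypothetical. Filtering the breaker's outputs with the critic is also an assumption of this sketch; the abstract does not spell that step out.

```python
import ast
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (bad, good)


def critic(code: str) -> bool:
    """Return True if the code parses; a stand-in for the critic."""
    try:
        ast.parse(code)
        return True
    except (SyntaxError, ValueError):
        return False


def collect_pairs(
    fixer: Callable[[str], str],
    breaker: Callable[[str], str],
    real_bad: List[str],
    real_good: List[str],
) -> Tuple[List[Pair], List[Pair]]:
    """One round of paired-data generation (hypothetical helper).

    Idea (i): keep fixer outputs on real bad inputs only if the critic accepts them.
    Idea (ii): apply the breaker to good code; this sketch additionally assumes
    we keep only outputs that the critic rejects.
    """
    fixer_pairs = []
    for bad in real_bad:
        candidate = fixer(bad)
        if critic(candidate):
            fixer_pairs.append((bad, candidate))

    breaker_pairs = []
    for good in real_good:
        candidate = breaker(good)
        if not critic(candidate):
            breaker_pairs.append((candidate, good))

    return fixer_pairs, breaker_pairs


# Toy stand-ins for the learned models (assumptions for illustration only).
def toy_fixer(code: str) -> str:
    """Append a closing parenthesis."""
    return code + ")"


def toy_breaker(code: str) -> str:
    """Drop the last character."""
    return code[:-1]


fixer_pairs, breaker_pairs = collect_pairs(
    toy_fixer,
    toy_breaker,
    real_bad=["print(1", "def f(:"],
    real_good=["print(3)"],
)
print(fixer_pairs)
print(breaker_pairs)
```

In a full training loop, the collected pairs would be used to retrain the fixer and the breaker, and the round would be repeated; the training itself is omitted here.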

Updated: 2021-06-15