Learning Highly Recursive Input Grammars
arXiv - CS - Programming Languages. Pub Date: 2021-08-30, arXiv:2108.13340
Neil Kulkarni, Caroline Lemieux, Koushik Sen

This paper presents Arvada, an algorithm for learning context-free grammars from a set of positive examples and a Boolean-valued oracle. Arvada learns a context-free grammar by building parse trees from the positive examples. Starting from initially flat trees, Arvada adds structure to these trees with a key operation: it bubbles sequences of sibling nodes in the trees into a new node, adding a layer of indirection to the tree. Bubbling operations enable recursive generalization in the learned grammar. We evaluate Arvada against GLADE and find it achieves on average a 4.98x increase in recall and a 3.13x increase in F1 score, while incurring only a 1.27x slowdown and requiring only 0.87x as many calls to the oracle. Arvada's improvement over GLADE is particularly marked on grammars with highly recursive structure, like those of programming languages.
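To make the key operation concrete, the following is a minimal sketch (not Arvada's actual implementation) of what "bubbling" a sequence of sibling nodes into a new node looks like on a simple tuple-based parse tree; the node representation, function name, and label `t1` are illustrative assumptions:

```python
# A tree node is (label, children); leaves are terminal strings.

def bubble(node, start, end, new_label):
    """Replace children[start:end] of `node` with a single new node
    labeled `new_label` whose children are that sibling sequence,
    adding one layer of indirection to the tree."""
    label, children = node
    bubbled = (new_label, children[start:end])
    return (label, children[:start] + [bubbled] + children[end:])

# Initially flat tree for the example input "( a )": every token is a
# direct child of the start symbol.
flat = ("t0", ["(", "a", ")"])

# Bubble the middle token into a candidate nonterminal t1.
tree = bubble(flat, 1, 2, "t1")
# tree == ("t0", ["(", ("t1", ["a"]), ")"])
```

If expansions of the new node can later be substituted for one another while the oracle still accepts the resulting strings, the indirection layer becomes a reusable nonterminal, which is what permits recursive rules to emerge.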

Updated: 2021-08-31