


Purely functional GLL parsing
Journal of Computer Languages (IF 1.7), Pub Date: 2020-02-10, DOI: 10.1016/j.cola.2020.100945
L. Thomas van Binsbergen, Elizabeth Scott, Adrian Johnstone

Generalised parsing has become increasingly important in the context of software language design and several compiler generators and language workbenches have adopted generalised parsing algorithms such as GLR and GLL. The original GLL parsing algorithms are described in low-level pseudo-code as the output of a parser generator. This paper explains GLL parsing differently, defining the FUN-GLL algorithm as a collection of pure, mathematical functions and focussing on the logic of the algorithm by omitting implementation details. In particular, the data structures are modelled by abstract sets and relations rather than specialised implementations. The description is further simplified by omitting lookahead and adopting the binary subtree representation of derivations to avoid the clerical overhead of graph construction.

Conventional parser combinators inherit the drawbacks from the recursive descent algorithms they implement. Based on FUN-GLL, this paper defines generalised parser combinators that overcome these problems. The algorithm is described in the same notation and style as FUN-GLL and uses the same data structures. Both algorithms are explained as a generalisation of basic recursive descent algorithms. The generalised parser combinators of this paper have several advantages over combinator libraries that generate internal grammars. For example, with the generalised parser combinators it is possible to parse larger permutation phrases and to write parsers for languages that are not context-free.
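
The drawback inherited from recursive descent can be illustrated with a minimal sketch of conventional "list of successes" recogniser combinators. This is not the paper's FUN-GLL algorithm or its generalised combinators, and it is written in Python rather than Haskell purely for brevity. All names (`term`, `seq`, `alt`, `lazy`, `recognises`) are hypothetical. Each recogniser maps an input string and a start index to the set of indices at which a match may end.

```python
from typing import Callable, Set

# A recogniser maps (input, start index) to the set of possible end indices.
Recogniser = Callable[[str, int], Set[int]]

def term(c: str) -> Recogniser:
    """Match the single character c."""
    def run(s: str, i: int) -> Set[int]:
        return {i + 1} if i < len(s) and s[i] == c else set()
    return run

def epsilon() -> Recogniser:
    """Match the empty string."""
    return lambda s, i: {i}

def seq(*ps: Recogniser) -> Recogniser:
    """Run the recognisers one after another, threading every end index."""
    def run(s: str, i: int) -> Set[int]:
        ends = {i}
        for p in ps:
            ends = {k for j in ends for k in p(s, j)}
        return ends
    return run

def alt(*ps: Recogniser) -> Recogniser:
    """Try every alternative and collect all end indices."""
    def run(s: str, i: int) -> Set[int]:
        return set().union(*(p(s, i) for p in ps))
    return run

def lazy(thunk: Callable[[], Recogniser]) -> Recogniser:
    """Delay lookup so a grammar rule can refer to itself."""
    return lambda s, i: thunk()(s, i)

def recognises(p: Recogniser, s: str) -> bool:
    """True if p can consume the whole input."""
    return len(s) in p(s, 0)

# Right-recursive rule: As ::= 'a' As | epsilon
as_right = alt(seq(term("a"), lazy(lambda: as_right)), epsilon())

# Left-recursive rule: As ::= As 'a' | epsilon
# Calling this recogniser re-enters itself at the same index without
# consuming input, so plain recursive descent never terminates.
as_left = alt(seq(lazy(lambda: as_left), term("a")), epsilon())
```

Under these assumptions, `recognises(as_right, "aaa")` succeeds, whereas `recognises(as_left, "a")` exhausts the call stack (a `RecursionError` in Python). Left recursion of this kind is one of the problems that generalised algorithms such as GLL are designed to handle.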

The ‘BNF combinator library’ is built around the generalised parser combinators. With the library, embedded and executable syntax specifications are written. The specifications contain semantic actions for interpreting programs and constructing syntax trees. The library takes advantage of Haskell’s type-system to type-check semantic actions and Haskell’s abstraction mechanism enables ‘reuse through abstraction’. The practicality of the library is demonstrated by running parsers obtained from the syntax descriptions of several software languages.
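
The idea of attaching semantic actions to an embedded syntax specification can be sketched as follows. This is an illustrative toy in Python, not the API of the BNF combinator library (which is written in Haskell and type-checks its actions, something this sketch does not do). The names `rule`, `alt`, `term`, `digit`, `lazy` and `parse` are hypothetical. The same grammar is given two sets of actions: one interprets the input, the other builds a syntax tree.

```python
def term(c):
    """Parser for the single character c; its value is c itself."""
    def run(s, i):
        return [(c, i + 1)] if i < len(s) and s[i] == c else []
    return run

def digit(s, i):
    """Parser for one decimal digit; its value is the digit as an int."""
    return [(int(s[i]), i + 1)] if i < len(s) and s[i].isdigit() else []

def rule(action, *ps):
    """Sequence the parsers ps and apply the semantic action to their values."""
    def run(s, i):
        partial = [((), i)]
        for p in ps:
            partial = [(vals + (v,), k) for vals, j in partial for v, k in p(s, j)]
        return [(action(*vals), k) for vals, k in partial]
    return run

def alt(*ps):
    """Collect the results of every alternative."""
    def run(s, i):
        return [r for p in ps for r in p(s, i)]
    return run

def lazy(thunk):
    """Delay lookup so a rule can refer to itself."""
    return lambda s, i: thunk()(s, i)

def parse(p, s):
    """Return the values of all parses that consume the whole input."""
    return [v for v, k in p(s, 0) if k == len(s)]

# Expr ::= Digit '+' Expr | Digit, with actions that evaluate the sum.
expr_eval = alt(
    rule(lambda d, _plus, e: d + e, digit, term("+"), lazy(lambda: expr_eval)),
    rule(lambda d: d, digit),
)

# The same grammar, with actions that construct a syntax tree instead.
expr_tree = alt(
    rule(lambda d, _plus, e: ("+", d, e), digit, term("+"), lazy(lambda: expr_tree)),
    rule(lambda d: d, digit),
)
```

With these definitions, `parse(expr_eval, "1+2+3")` yields `[6]` and `parse(expr_tree, "1+2+3")` yields `[("+", 1, ("+", 2, 3))]`. Because the specification is ordinary host-language code, helpers such as `rule` can be abstracted over and reused, which is the sense of "reuse through abstraction" in this sketch.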


Updated: 2020-02-10
