Formalising and implementing Boost POSIX regular expression matching
Theoretical Computer Science (IF 0.9), Pub Date: 2021-01-08, DOI: 10.1016/j.tcs.2021.01.010
Martin Berglund, Willem Bester, Brink van der Merwe

Whereas Perl-compatible regular expression matchers typically exhibit some variation of leftmost-greedy semantics, those conforming to the POSIX standard are prescribed leftmost-longest semantics. However, the POSIX standard leaves some room for interpretation, and Fowler and Kuklewicz have done experimental work to confirm differences between various POSIX matchers. The Boost library has an interesting take on the POSIX standard, where it maximises the leftmost match not with respect to subexpressions of the regular expression pattern, but rather, with respect to capturing groups. In our work, we provide the first formalisation of Boost semantics, analyse the complexity of regular expression matching when using Boost semantics, and provide efficient algorithms for both online and multipass matching.
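
The contrast between leftmost-greedy and leftmost-longest matching mentioned in the abstract can be illustrated with a small Python sketch. It is not taken from the paper and does not model Boost's capturing-group rule or the paper's algorithms; it only compares the overall match. Python's `re` module is used as a stand-in for a Perl-style backtracking matcher, and the helper names `leftmost_greedy` and `leftmost_longest` are hypothetical. The longest-match helper is a deliberately naive brute-force search, chosen for clarity rather than efficiency.

```python
import re


def leftmost_greedy(pattern, text):
    """Overall match as found by Python's backtracking matcher (Perl-style)."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


def leftmost_longest(pattern, text):
    """Naive sketch of a leftmost-longest overall match.

    For each start position, from left to right, try the candidate end
    positions from longest to shortest and return the first substring that
    the pattern matches in full. Assumption: only the overall match is
    modelled, not how subexpressions or capturing groups are assigned.
    """
    compiled = re.compile(pattern)
    for start in range(len(text) + 1):
        for end in range(len(text), start - 1, -1):
            if compiled.fullmatch(text, start, end):
                return text[start:end]
    return None


if __name__ == "__main__":
    # With the alternation a|ab, a backtracking matcher commits to the first
    # alternative that succeeds, while leftmost-longest prefers the longer one.
    print(leftmost_greedy(r"a|ab", "xab"))
    print(leftmost_longest(r"a|ab", "xab"))
```

On the input `xab` with the pattern `a|ab`, both helpers start the match at the same leftmost position, but they disagree on where it ends. That disagreement is the kind of difference the abstract refers to.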

Updated: 2021-01-22
