A Purely Regular Approach to Non-Regular Core Spanners
arXiv - CS - Formal Languages and Automata Theory. Pub Date: 2020-10-26, DOI: arxiv-2010.13442
Markus L. Schmid and Nicole Schweikardt

The regular spanners (characterised by vset-automata) are closed under the algebraic operations of union, join and projection, and have desirable algorithmic properties. The core spanners, introduced by Fagin, Kimelfeld, Reiss, and Vansummeren (PODS 2013, JACM 2015) as a formalisation of the core functionality of the query language AQL used in IBM's SystemT, additionally need string-equality selections. Freydenberger and Holldack (ICDT 2016, Theory of Computing Systems 2018) have shown that this leads to high complexity and even undecidability of the typical problems in static analysis and query evaluation. We propose an alternative approach to core spanners: we incorporate the string-equality selections directly into the regular language that represents the underlying regular spanner, instead of treating them as an algebraic operation on the table extracted by the regular spanner. This yields a fragment of core spanners that has slightly weaker expressive power than the full class of core spanners, but arguably still covers the intuitive applications of string-equality selections for information extraction and has much better upper complexity bounds for the typical problems in static analysis and query evaluation.
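
To make the distinction in the abstract concrete, here is a minimal Python sketch. It is an illustration only and not the authors' formalism: the function names (`regular_spanner`, `select_equal`, `has_repeat`) are hypothetical, the "spanner" is a toy that extracts pairs of word spans, and Python's regex backreference is used merely as a loose stand-in for putting the equality requirement into the pattern itself. It assumes single-line documents made of lowercase words separated by spaces.

```python
import re

def regular_spanner(doc):
    """Toy regular spanner: every pair (x, y) of word spans with x before y."""
    words = [m.span() for m in re.finditer(r"[a-z]+", doc)]
    return [(x, y) for i, x in enumerate(words) for y in words[i + 1:]]

def select_equal(doc, table):
    """String-equality selection, applied to the already extracted table."""
    return [(x, y) for (x, y) in table if doc[x[0]:x[1]] == doc[y[0]:y[1]]]

def has_repeat(doc):
    """Equality requirement written into the pattern via a backreference.

    Assumption: this is only an analogy for moving the equality check
    into the pattern; it is not the construction used in the paper.
    """
    return re.search(r"\b([a-z]+)\b.*\b\1\b", doc) is not None

doc = "ab cd ab"
table = regular_spanner(doc)          # table of span pairs, no equality yet
equal_pairs = select_equal(doc, table)  # keep pairs whose substrings coincide
```

In the first style the equality test runs as a separate operation over the extracted table; in the second it is part of the pattern that is matched against the document.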

Updated: 2020-10-27