On (co-lex) Ordering Automata


arXiv - CS - Formal Languages and Automata Theory. Pub Date: 2021-06-04, DOI: arxiv-2106.02309
Giovanna D'Agostino, Nicola Cotumaccio, Alberto Policriti, Nicola Prezza

The states of a deterministic finite automaton A can be identified with collections of words in Pf(L(A)) -- the set of prefixes of words belonging to the regular language accepted by A. Words can be ordered, and among the many possible orders a very natural one is the co-lexicographic order. Its naturalness stems from the fact that it suggests a transfer of the order from words to the automaton's states. A number of papers have proposed automata admitting a total ordering of states that is coherent with the ordering of the sets of words reaching them. This class of ordered automata -- the Wheeler automata -- turned out to be efficiently stored and searched using an index. Unfortunately, not all automata can be totally ordered in this way. However, automata can always be partially ordered, and an intrinsic measure of their complexity can be defined and effectively determined: the minimum width among their admissible partial orders. As shown in previous works, this new concept of the width of an automaton has useful consequences in the fields of graph compression, indexing data structures, and automata theory. In this paper we prove that a canonical, minimum-width, partially-ordered automaton accepting a language L -- dubbed the Hasse automaton H of L -- can be exhibited. H provides, in a precise sense, the best possible way to (partially) order the states of any automaton accepting L, as long as we want to maintain an operational link with the (co-lexicographic) order of Pf(L(A)). Using H we prove that the width of the language can be effectively computed from the minimum automaton recognizing the language. Finally, we explore the relationship between two (often conflicting) objectives: minimizing the width and minimizing the number of states of an automaton.
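
To make the two central notions of the abstract concrete, here is a minimal Python sketch. It is an illustration only, not code from the paper. It assumes the usual reading of co-lexicographic order (words compared from right to left) and the edge-based conditions commonly used to define Wheeler graphs in the literature; the abstract itself does not spell these conditions out. The names `colex_key` and `is_wheeler_order`, and the three-state example automaton, are hypothetical.

```python
from itertools import product


def colex_key(word):
    """Sort key for co-lexicographic order: compare words right to left."""
    return word[::-1]


def is_wheeler_order(edges, order):
    """Check a candidate total order of states (assumed Wheeler-graph conditions).

    edges: iterable of (source, label, target) triples.
    order: list of all states, smallest first.
    """
    rank = {state: i for i, state in enumerate(order)}
    edges = list(edges)
    targets = {v for _, _, v in edges}

    # States without incoming edges must precede all states that have one.
    no_in = [rank[s] for s in order if s not in targets]
    has_in = [rank[s] for s in order if s in targets]
    if no_in and has_in and max(no_in) > min(has_in):
        return False

    for (u, a, v), (u2, a2, v2) in product(edges, repeat=2):
        # A smaller label must lead to a strictly smaller target state.
        if a < a2 and not rank[v] < rank[v2]:
            return False
        # Equal labels: the order of the sources carries over to the targets.
        if a == a2 and rank[u] < rank[u2] and not rank[v] <= rank[v2]:
            return False
    return True


# Hypothetical example: state 0 is initial, every 'a' leads to 1, every 'b' to 2.
EDGES = [(0, "a", 1), (1, "a", 1), (0, "b", 2), (1, "b", 2)]

print(sorted(["ab", "b", "aa", "ba"], key=colex_key))
print(is_wheeler_order(EDGES, [0, 1, 2]))
print(is_wheeler_order(EDGES, [0, 2, 1]))
```

In this toy automaton the words reaching state 1 all end in `a` and those reaching state 2 all end in `b`, so the order 0 < 1 < 2 agrees with the co-lexicographic order of the words, while 0 < 2 < 1 does not. The pairwise check is quadratic in the number of edges, which is acceptable for an illustration but not for large inputs.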


Updated: 2021-06-07
