Absent Subsequences in Words
arXiv - CS - Formal Languages and Automata Theory. Pub Date: 2021-08-31, DOI: arxiv-2108.13968
Maria Kosche, Tore Koß, Florin Manea, Stefan Siemer

An absent factor of a string $w$ is a string $u$ which does not occur as a contiguous substring (a.k.a. factor) inside $w$. We extend this well-studied notion and define absent subsequences: a string $u$ is an absent subsequence of a string $w$ if $u$ does not occur as a subsequence (a.k.a. scattered factor) inside $w$. Of particular interest to us are minimal absent subsequences, i.e., absent subsequences whose every proper subsequence is not absent, and shortest absent subsequences, i.e., absent subsequences of minimal length. We show a series of combinatorial and algorithmic results regarding these two notions. For instance: we give combinatorial characterisations of the sets of minimal and, respectively, shortest absent subsequences in a word, as well as compact representations of these sets; we show how we can test efficiently if a string is a shortest or minimal absent subsequence in a word, and we give efficient algorithms computing the lexicographically smallest absent subsequence of each kind; also, we show how a data structure for answering shortest absent subsequence queries for the factors of a given string can be efficiently computed.
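
To make the two notions concrete, here is a minimal Python sketch. It is not the paper's algorithms: the function names are hypothetical, the minimality test is a naive check that deletes one letter at a time, and the construction relies on a standard greedy "arch" decomposition (cut $w$ each time every letter of the alphabet has been seen). It also assumes that $w$ only uses letters from the given non-empty alphabet. The construction returns some shortest absent subsequence, not necessarily the lexicographically smallest one.

```python
def is_subsequence(u: str, w: str) -> bool:
    """Return True if u occurs in w as a (scattered) subsequence."""
    it = iter(w)
    # `ch in it` advances the iterator past the first match of ch.
    return all(ch in it for ch in u)


def is_minimal_absent(u: str, w: str) -> bool:
    """Naive test: u is absent from w, but deleting any single letter
    of u gives a subsequence of w. Checking single deletions is enough,
    because every proper subsequence of u is contained in one of them."""
    if is_subsequence(u, w):
        return False
    return all(is_subsequence(u[:i] + u[i + 1:], w) for i in range(len(u)))


def one_shortest_absent(w: str, alphabet: str) -> str:
    """Build one shortest absent subsequence of w over `alphabet`.

    Assumption: w only contains letters of `alphabet`, which is non-empty.
    Greedy idea: close an 'arch' as soon as all letters have been seen,
    keep the letter that closed it, and finish with a letter that is
    missing from the incomplete remainder.
    """
    sigma = set(alphabet)
    seen = set()
    picked = []
    for ch in w:
        seen.add(ch)
        if seen == sigma:       # arch complete; ch was the last new letter
            picked.append(ch)
            seen = set()
    missing = min(sigma - seen)  # never empty: `seen` is a strict subset
    return "".join(picked) + missing


if __name__ == "__main__":
    w = "abcabab"
    sas = one_shortest_absent(w, "abc")
    print(sas, is_subsequence(sas, w), is_minimal_absent(sas, w))
    # "bbbb" is absent and minimal, but longer than the shortest one.
    print(is_minimal_absent("bbbb", w), is_minimal_absent("ccc", w))
```

In this sketch a shortest absent subsequence has length one more than the number of complete arches, which is why every shortest absent subsequence is also minimal, while the converse fails (as `"bbbb"` illustrates for `"abcabab"`).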

Updated: 2021-09-01