Theoretical Computer Science (IF 0.9), Pub Date: 2020-11-20, DOI: 10.1016/j.tcs.2020.11.036
Arnab Ganguly, Wing-Kai Hon, Kunihiko Sadakane, Rahul Shah, Sharma V. Thankachan, Yilin Yang
Let D be a collection of d patterns of total length n characters, which are chosen from an alphabet Σ of size σ. Given a text T (over Σ), the dictionary indexing problem is to build a data structure that reports all positions j (called occurrences) at which at least one pattern in D matches the same-length substring of T starting at j. We consider this problem under the following definitions of matching.
- Parameterized Matching: The characters of Σ are partitioned into static characters and parameterized characters. Two equal-length strings S and S′ are a parameterized match iff the static characters match exactly and there exists a one-to-one function that renames the parameterized characters in S to those in S′.
- Order-Preserving Matching: The alphabet Σ is ordered. Two equal-length strings S and S′ are an order-preserving match iff for any two integers i, j ∈ [1, |S|], S[i] ≺ S[j] ⟺ S′[i] ≺ S′[j], where ≺ denotes the precedence order in Σ.
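The two matching definitions, and what it means to report occurrences, can be made concrete with a brute-force sketch. This is only an illustration of the problem statement, not the space-efficient data structure the paper designs. The function names, the 0-based positions, the assumption that patterns are non-empty, and the use of Python's built-in `<` as the precedence order ≺ are all assumptions made for this example.

```python
from typing import Callable, List, Sequence, Set


def is_parameterized_match(s: Sequence, t: Sequence, static: Set) -> bool:
    """Check the parameterized-match definition directly.

    Static characters must agree exactly; parameterized characters must be
    related by a one-to-one renaming (checked with two dictionaries).
    """
    if len(s) != len(t):
        return False
    fwd, bwd = {}, {}  # renaming S -> S' and its inverse
    for a, b in zip(s, t):
        if a in static or b in static:
            if a != b:
                return False
            continue
        # setdefault records the first pairing and returns it on later visits
        if fwd.setdefault(a, b) != b or bwd.setdefault(b, a) != a:
            return False
    return True


def is_order_preserving_match(s: Sequence, t: Sequence) -> bool:
    """Check S[i] < S[j] <=> S'[i] < S'[j] for every pair (i, j).

    Quadratic time; Python's `<` stands in for the order on the alphabet.
    """
    if len(s) != len(t):
        return False
    n = len(s)
    return all((s[i] < s[j]) == (t[i] < t[j])
               for i in range(n) for j in range(n))


def report_occurrences(text: Sequence, patterns: List[Sequence],
                       match: Callable[[Sequence, Sequence], bool]) -> List[int]:
    """Return every 0-based position j where some pattern matches the
    same-length substring of `text` starting at j (naive scan)."""
    hits = []
    for j in range(len(text)):
        if any(j + len(p) <= len(text) and match(p, text[j:j + len(p)])
               for p in patterns):
            hits.append(j)
    return hits


# Usage: a rising triple and a falling pair as order-preserving patterns.
text = [3, 1, 4, 1, 5, 9, 2, 6]
patterns = [[1, 2, 3], [2, 1]]
print(report_occurrences(text, patterns, is_order_preserving_match))
# prints [0, 2, 3, 5]
```

The scan compares every pattern against every position in T, so it makes on the order of d · |T| match checks. It serves only as a reference for what a dictionary index must output.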
Title: A framework for designing space-efficient dictionaries for parameterized and order-preserving matching