A Framework For Designing Space-Efficient Dictionaries for Parameterized and Order-Preserving Matching
Theoretical Computer Science ( IF 0.9 ) Pub Date : 2020-11-20 , DOI: 10.1016/j.tcs.2020.11.036
Arnab Ganguly , Wing-Kai Hon , Kunihiko Sadakane , Rahul Shah , Sharma V. Thankachan , Yilin Yang

Let $\mathcal{P}$ be a collection of $d$ patterns $\{P_1, P_2, \dots, P_d\}$ of total length $n$ characters, which are chosen from an alphabet $\Sigma$ of size $\sigma$. Given a text $T$ (over $\Sigma$), the dictionary indexing problem is to create a data structure using which we can report all positions $j$ (called occurrences) where at least one of the patterns $P_i \in \mathcal{P}$ is a match with the same-length substring of $T$ that starts at $j$. We consider this problem under the following definitions of matching.
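
To make the problem statement concrete, the following is a minimal brute-force sketch of the reporting task, not the index described in this abstract. The names `report_occurrences` and `is_match` are hypothetical; the matching relation is passed in as a predicate so that any definition of matching can be plugged in, and positions are reported 1-based.

```python
from typing import Callable, List, Sequence


def report_occurrences(
    patterns: Sequence[Sequence],
    text: Sequence,
    is_match: Callable[[Sequence, Sequence], bool],
) -> List[int]:
    """Naive baseline (illustrative only): scan every text position.

    Returns the 1-based positions j such that at least one pattern
    matches the same-length substring of `text` starting at j.
    """
    occurrences = []
    for j in range(len(text)):
        for p in patterns:
            if len(p) == 0:
                continue  # assumption: empty patterns are ignored
            window = text[j:j + len(p)]
            # The window must have the full pattern length to count.
            if len(window) == len(p) and is_match(p, window):
                occurrences.append(j + 1)
                break  # each position is reported at most once
    return occurrences


# Usage with exact matching as the predicate.
print(report_occurrences(["ab", "ca"], "abcab", lambda a, b: a == b))
```

This scan tries every pattern at every text position, so its running time grows with the text length times the total pattern length, which is the cost an index is meant to avoid.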

Parameterized Matching: The characters of $\Sigma$ are partitioned into static characters and parameterized characters. Two equal length strings $S$ and $S'$ are a parameterized match iff the static characters match exactly, and there exists a one-to-one function which renames the parameterized characters in $S$ to those in $S'$.
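
The definition above can be checked directly for a single pair of strings. The sketch below is an illustration under stated assumptions: the function name `is_parameterized_match` and the `static_chars` argument are hypothetical, and every character outside `static_chars` is treated as parameterized. It builds the renaming in both directions to enforce that the function is one-to-one.

```python
from typing import Dict, Sequence, Set


def is_parameterized_match(s: Sequence, t: Sequence, static_chars: Set) -> bool:
    """Direct check of the parameterized-match definition (illustrative)."""
    if len(s) != len(t):
        return False
    forward: Dict = {}   # parameterized character of s -> character of t
    backward: Dict = {}  # parameterized character of t -> character of s
    for a, b in zip(s, t):
        if a in static_chars or b in static_chars:
            # Static characters must match exactly.
            if a != b:
                return False
            continue
        # Record the renaming; reject if it conflicts with an earlier
        # choice in either direction (which would break one-to-one).
        if forward.setdefault(a, b) != b:
            return False
        if backward.setdefault(b, a) != a:
            return False
    return True


print(is_parameterized_match("x+y+x", "a+b+a", {"+"}))  # consistent renaming
print(is_parameterized_match("x+y+x", "a+a+a", {"+"}))  # x and y both map to a
```

The second call fails because two distinct parameterized characters would be renamed to the same character, so no one-to-one function exists.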

Order-Preserving Matching: The alphabet $\Sigma$ is ordered. Two equal length strings $S$ and $S'$ are an order-preserving match iff for any two integers $i, j \in [1, |S|]$, $S[i] \prec S[j] \iff S'[i] \prec S'[j]$, where $\prec$ denotes the precedence order in $\Sigma$.
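
The order-preserving condition can likewise be tested pair by pair. The following is a minimal quadratic-time sketch that compares every pair of positions; the name `is_order_preserving_match` is hypothetical, and Python's built-in `<` is assumed to stand in for the precedence order on the alphabet.

```python
from typing import Sequence


def is_order_preserving_match(s: Sequence, t: Sequence) -> bool:
    """Direct check of the order-preserving definition (illustrative).

    For every pair of positions (i, j), s[i] precedes s[j] exactly
    when t[i] precedes t[j].
    """
    if len(s) != len(t):
        return False
    n = len(s)
    for i in range(n):
        for j in range(n):
            if (s[i] < s[j]) != (t[i] < t[j]):
                return False
    return True


print(is_order_preserving_match([10, 20, 15], [1, 9, 4]))  # same relative order
print(is_order_preserving_match([10, 20, 15], [1, 9, 9]))  # tie breaks the order
```

Because the alphabet is totally ordered, checking the strict relation in both directions also forces equal characters in one string to line up with equal characters in the other.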

Let $\varepsilon > 0$ be an arbitrarily small constant. For parameterized matching, we first present a compact $O(n \log \sigma + d \log n)$-bit index that reports all $occ$ occurrences in $O(|T|(\log \sigma + \log_\sigma n) + occ)$ time, and then a succinct $n \log \sigma + o(n \log \sigma) + O(d \log n)$-bit index that reports all $occ$ occurrences in $O(|T|(\log \sigma + \log^{\varepsilon} n \log_\sigma n) + occ)$ time. For order-preserving matching, we present indexes of the same sizes, but with slightly increased query time.




Updated: 2020-11-21