Grammar-compressed indexes with logarithmic search time
Journal of Computer and System Sciences (IF 1.1), Pub Date: 2020-12-30, DOI: 10.1016/j.jcss.2020.12.001
Francisco Claude, Gonzalo Navarro, Alejandro Pacheco

Let a text T[1..n] be the only string generated by a context-free grammar with g (terminal and nonterminal) symbols, and of size G (measured as the sum of the lengths of the right-hand sides of the rules). Such a grammar, called a grammar-compressed representation of T, can be encoded using G lg g bits. We introduce the first grammar-compressed index that uses O(G lg n) bits (precisely, G lg n + (2+ϵ) G lg g for any constant ϵ > 0) and can find the occ occurrences of patterns P[1..m] in time O((m^2 + occ) lg G). We implement the index and demonstrate its practicality in comparison with the state of the art, on highly repetitive text collections.
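
To make the quantities in the abstract concrete, the following is a minimal sketch that computes g, G, and n for a toy grammar and evaluates the two space formulas. The grammar, the function names, and the choice ϵ = 0.5 are hypothetical and chosen only for illustration; the sketch assumes lg means the base-2 logarithm and ignores rounding and lower-order terms. It does not implement the index itself.

```python
import math

# Hypothetical toy grammar that generates exactly one string.
# Every nonterminal has a single rule; symbols without a rule are terminals.
RULES = {
    "A": ["a", "b"],
    "B": ["A", "A"],
    "S": ["B", "A", "B"],
}
START = "S"


def expand(symbol, rules):
    """Return the unique string derived from `symbol`."""
    if symbol not in rules:
        return symbol  # terminal
    return "".join(expand(s, rules) for s in rules[symbol])


def grammar_stats(rules, start):
    """Return (g, G, n) as defined in the abstract."""
    terminals = {s for rhs in rules.values() for s in rhs if s not in rules}
    g = len(rules) + len(terminals)             # terminal and nonterminal symbols
    G = sum(len(rhs) for rhs in rules.values())  # total right-hand-side length
    n = len(expand(start, rules))                # length of the generated text
    return g, G, n


def space_bits(g, G, n, eps):
    """Evaluate G lg g and G lg n + (2 + eps) G lg g, assuming lg = log2."""
    plain = G * math.log2(g)
    index = G * math.log2(n) + (2 + eps) * G * math.log2(g)
    return plain, index


if __name__ == "__main__":
    g, G, n = grammar_stats(RULES, START)
    plain, index = space_bits(g, G, n, eps=0.5)
    print(f"T = {expand(START, RULES)}, g = {g}, G = {G}, n = {n}")
    print(f"grammar encoding: {plain:.1f} bits, index bound: {index:.1f} bits")
```

On a text this short the grammar is not smaller than the text itself; the formulas become meaningful on highly repetitive collections, where G is much smaller than n.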




Updated: 2020-12-31