String Indexing with Compressed Patterns
arXiv - CS - Data Structures and Algorithms Pub Date : 2019-09-26 , DOI: arxiv-1909.11930 Philip Bille, Inge Li Gørtz, and Teresa Anna Steiner
Given a string $S$ of length $n$, the classic string indexing problem is to
preprocess $S$ into a compact data structure that supports efficient subsequent
pattern queries. In this paper we consider the basic variant where the pattern
is given in compressed form and the goal is to achieve query time that is fast
in terms of the compressed size of the pattern. This captures the common
client-server scenario, where a client submits a query and communicates it in
compressed form to a server. Instead of the server decompressing the query
before processing it, we consider how to efficiently process the compressed
query directly. Our main result is a novel linear space data structure that
achieves near-optimal query time for patterns compressed with the classic
Lempel-Ziv compression scheme. Along the way we develop several data structural
techniques of independent interest, including a novel data structure that
compactly encodes all LZ77 compressed suffixes of a string in linear space and
a general decomposition of tries that reduces the search time from logarithmic
in the size of the trie to logarithmic in the length of the pattern.
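
To make the setting concrete, the following is a minimal sketch of what "a pattern compressed with Lempel-Ziv" can look like, together with the naive baseline the abstract argues against (the server decompresses the query, then searches). It is not the data structure from the paper. The function names, the phrase encoding (a literal character or a `(source, length)` copy), and the allowance for self-referential copies are assumptions made for illustration; the paper's exact LZ77 variant may differ.

```python
from typing import List, Tuple, Union

# A phrase is either a literal character or a (source, length) copy
# from earlier in the text. This encoding is an illustrative assumption.
Phrase = Union[str, Tuple[int, int]]


def lz77_parse(text: str) -> List[Phrase]:
    """Greedy LZ77-style factorization (quadratic time, illustration only)."""
    phrases: List[Phrase] = []
    i, n = 0, len(text)
    while i < n:
        best_len, best_src = 0, -1
        for src in range(i):
            length = 0
            # The copy may overlap the current position (self-reference).
            while i + length < n and text[src + length] == text[i + length]:
                length += 1
            if length > best_len:
                best_len, best_src = length, src
        if best_len == 0:
            phrases.append(text[i])  # first occurrence of this character
            i += 1
        else:
            phrases.append((best_src, best_len))
            i += best_len
    return phrases


def lz77_decode(phrases: List[Phrase]) -> str:
    """Rebuild the text; copies are expanded one character at a time."""
    out: List[str] = []
    for phrase in phrases:
        if isinstance(phrase, str):
            out.append(phrase)
        else:
            src, length = phrase
            for k in range(length):
                out.append(out[src + k])
    return "".join(out)


def naive_occurrences(s: str, phrases: List[Phrase]) -> List[int]:
    """Baseline: decompress the pattern first, then scan S for it."""
    pattern = lz77_decode(phrases)
    return [i for i in range(len(s) - len(pattern) + 1)
            if s.startswith(pattern, i)]


compressed = lz77_parse("abababab")
print(compressed)                                       # ['a', 'b', (0, 6)]
print(naive_occurrences("xxabababababyy", compressed))  # [2, 4]
```

Here the 8-character pattern is described by 3 phrases. The baseline still pays for the full decompressed length, whereas the abstract's goal is a query time that depends on the number of phrases instead.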
Updated: 2020-09-30