Journal of Computer and System Sciences (IF 1.494), Pub Date: 2020-11-18, DOI: 10.1016/j.jcss.2020.11.002
Djamal Belazzougui; Manuel Cáceres; Travis Gagie; Paweł Gawrychowski; Juha Kärkkäinen; Gonzalo Navarro; Alberto Ordóñez; Simon J. Puglisi; Yasuo Tabei

Let string $S[1..n]$ be parsed into $z$ phrases by the Lempel-Ziv algorithm. The corresponding compression algorithm encodes $S$ in $\mathcal{O}(z)$ space, but it does not support random access to $S$. We introduce a data structure, the block tree, that represents $S$ in $\mathcal{O}(z \log(n/z))$ space and extracts any symbol of $S$ in time $\mathcal{O}(\log(n/z))$, among other space-time tradeoffs. The structure also supports other queries that are useful for building compressed data structures on top of $S$. Further, block trees can be built in linear time and in a scalable manner. Our experiments show that block trees offer relevant space-time tradeoffs compared to other compressed string representations for highly repetitive strings.
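
To make the quantity $z$ concrete, the following is a minimal sketch of a greedy Lempel-Ziv parse. It assumes one common variant of the parse (each phrase is the longest prefix of the remaining text that also starts at an earlier position, or a single new character when there is none); the abstract does not say which variant is used. The function names `lz_parse` and `space_bound` are hypothetical, the scan is a naive quadratic one rather than the linear-time construction mentioned above, and the bound is computed without its hidden constant. It is not an implementation of the block tree itself.

```python
import math


def lz_parse(s: str) -> list[str]:
    """Naive greedy Lempel-Ziv parse (assumed variant, quadratic time).

    Each phrase is the longest prefix of the unparsed suffix that also
    occurs starting at an earlier position (the occurrence may overlap
    the phrase), or a single character if no such prefix exists.
    """
    phrases = []
    n = len(s)
    i = 0
    while i < n:
        best = 0
        # Try every earlier starting position j and extend the match.
        for j in range(i):
            k = 0
            while i + k < n and s[j + k] == s[i + k]:
                k += 1
            best = max(best, k)
        length = max(best, 1)  # a fresh character forms its own phrase
        phrases.append(s[i:i + length])
        i += length
    return phrases


def space_bound(n: int, z: int) -> float:
    """Value of z * log2(n / z), i.e. the stated bound without constants."""
    return z * math.log2(n / z)


if __name__ == "__main__":
    text = "abababab"
    phrases = lz_parse(text)
    print(phrases, len(phrases), space_bound(len(text), len(phrases)))
```

On a highly repetitive input the number of phrases grows much more slowly than the length of the string, which is the regime in which a bound of the form $z \log(n/z)$ is much smaller than $n$.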
