A compressed dynamic self-index for highly repetitive text collections
Information and Computation (IF 0.8), Pub Date: 2020-01-15, DOI: 10.1016/j.ic.2020.104518
Takaaki Nishimoto, Yoshimasa Takabatake, Yasuo Tabei

We present a novel compressed dynamic self-index for highly repetitive text collections. The ESP-index is a static self-index for such collections, and its main drawback is slow pattern search for short patterns. Leveraging the idea behind the truncated suffix tree (TST), we obtain faster pattern search and develop the first compressed dynamic self-index, called the TST-index, which supports both fast pattern search and dynamic update operations on highly repetitive texts. Experiments on a benchmark dataset show that the pattern search performance of the TST-index is significantly better than that of the ESP-index for short patterns.
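
To give a rough feel for why truncation helps with short patterns, the following is a minimal sketch of a k-truncated suffix trie. It is an uncompressed, static toy and not the authors' TST-index; the names `Node`, `build_truncated_suffix_trie`, `find_short_pattern`, and the parameter `k` are assumptions made for illustration only. Every substring of length at most `k` is indexed, so a pattern of length at most `k` is located by walking at most `k` edges from the root.

```python
class Node:
    """Trie node: child edges plus the start positions of the substring it spells."""
    __slots__ = ("children", "positions")

    def __init__(self):
        self.children = {}
        self.positions = []


def build_truncated_suffix_trie(text, k):
    """Insert the first k characters of every suffix of text (hypothetical helper)."""
    root = Node()
    for i in range(len(text)):
        node = root
        for ch in text[i:i + k]:
            node = node.children.setdefault(ch, Node())
            # Record that the substring spelled by this node starts at position i.
            node.positions.append(i)
    return root


def find_short_pattern(root, pattern, k):
    """Return the start positions of pattern, which must have length 1..k."""
    if not pattern or len(pattern) > k:
        raise ValueError("pattern length must be between 1 and k")
    node = root
    for ch in pattern:
        node = node.children.get(ch)
        if node is None:
            return []  # pattern does not occur in the text
    return node.positions


trie = build_truncated_suffix_trie("abracadabra", 3)
print(find_short_pattern(trie, "abr", 3))
```

This toy stores explicit position lists and so uses far more space than a compressed index would; it only illustrates the bounded-depth lookup, not compression or dynamic updates.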




Updated: 2020-01-15