When a dollar makes a BWT
Theoretical Computer Science (IF 0.9), Pub Date: 2021-01-08, DOI: 10.1016/j.tcs.2021.01.008
Sara Giuliani, Zsuzsanna Lipták, Francesco Masillo, Romeo Rizzi

The Burrows-Wheeler Transform (BWT) is a reversible string transformation that plays a central role in text compression and is fundamental to many modern bioinformatics applications. The BWT is a permutation of the characters of the input string; it is generally more compressible than the original string and allows several types of queries to be answered more efficiently.
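
To make the transform and its reversibility concrete, here is a minimal Python sketch. It is not taken from the paper: the function names `bwt` and `inverse_bwt` are illustrative, the construction sorts all rotations explicitly (simple but far from efficient), and it assumes the input ends with a single `$` that compares smaller than every other character, which holds for ASCII letters and digits.

```python
def bwt(s):
    """BWT of s by sorting all rotations (naive; assumes s ends with a unique '$')."""
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation matrix.
    return "".join(rot[-1] for rot in rotations)


def inverse_bwt(v):
    """Recover the string ending with '$' whose BWT is v (assumes v is a valid BWT)."""
    # Stable sort of positions by character: order[r] is the position in v
    # of the r-th smallest character (ties broken by position).
    order = sorted(range(len(v)), key=lambda i: v[i])
    out = []
    r = 0
    for _ in range(len(v)):
        r = order[r]
        out.append(v[r])
    # out spells '$' followed by the text; move the sentinel to the end.
    return "".join(out[1:]) + "$"


print(bwt("banana$"))          # annb$aa
print(inverse_bwt("annb$aa"))  # banana$
```

The output `annb$aa` is a rearrangement of the characters of `banana$`, with equal characters tending to cluster, which is the property that compressors exploit.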

It is easy to see that not every string is a BWT image, and exact characterizations of BWT images are known. We investigate a related combinatorial question. In many applications, a sentinel character $ is appended to mark the end of the string, so the BWT of a string ending with $ contains exactly one $-character. Given a string w, we ask in which positions, if any, the $-character can be inserted to turn w into the BWT image of a word ending with $. We show that this depends only on the standard permutation of w and present an O(n log n)-time algorithm for identifying all such positions, improving on the naive quadratic-time algorithm. We also give a combinatorial characterization of such positions and develop bounds on their number and value. This is an extended version of [Giuliani et al. ICTCS 2019].
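
The question can be explored with a brute-force sketch in Python. This is not the paper's O(n log n) algorithm; it is a naive baseline that tries every insertion position and, because it re-sorts for each candidate, costs slightly more than quadratic time. It relies on a known characterization, taken here as an assumption: a string containing a unique, smallest sentinel is a BWT image exactly when its standard permutation (the stable sort of positions by character) is a single cycle. The names `standard_permutation`, `is_cyclic`, and `dollar_positions` are hypothetical, and `$` is assumed to compare smaller than every character of w.

```python
def standard_permutation(v):
    """pi[i] = rank of position i when positions are stably sorted by character."""
    order = sorted(range(len(v)), key=lambda i: v[i])  # Python's sort is stable
    pi = [0] * len(v)
    for rank, pos in enumerate(order):
        pi[pos] = rank
    return pi


def is_cyclic(pi):
    """True if the permutation pi consists of a single cycle."""
    steps, j = 0, 0
    while True:
        j = pi[j]
        steps += 1
        if j == 0:
            break
    return steps == len(pi)


def dollar_positions(w):
    """1-based positions at which inserting '$' turns w into a BWT image."""
    result = []
    for i in range(len(w) + 1):
        candidate = w[:i] + "$" + w[i:]
        if is_cyclic(standard_permutation(candidate)):
            result.append(i + 1)
    return result


print(dollar_positions("ab"))  # [3]
print(dollar_positions("ba"))  # [2]
```

For w = `annbaa`, position 5 is among the reported positions, since `annb$aa` is the BWT of `banana$`.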




Updated: 2021-01-22