Hap10: reconstructing accurate and long polyploid haplotypes using linked reads.
BMC Bioinformatics (IF 2.9), Pub Date: 2020-06-18, DOI: 10.1186/s12859-020-03584-5
Sina Majidian 1 , Mohammad Hossein Kahaei 1 , Dick de Ridder 2

Haplotype information is essential for many genetic and genomic analyses, including genotype-phenotype associations in humans, animals and plants. Haplotype assembly is a method for reconstructing haplotypes from DNA sequencing reads. With the advent of new sequencing technologies, new algorithms are needed to ensure long and accurate haplotypes. While a few linked-read haplotype assembly algorithms are available for diploid genomes, to the best of our knowledge, no algorithms that specifically exploit linked reads have yet been proposed for polyploids. The first haplotyping algorithm designed for linked reads generated from a polyploid genome is presented, built on a typical short-read haplotyping method, SDhaP. Using the input aligned reads and called variants, the haplotype-relevant information is extracted. Next, reads with the same barcode are combined to produce molecule-specific fragments. These fragments are then clustered into strongly connected components, which serve as input to a haplotype assembly core that estimates accurate and long haplotypes. Hap10 is a novel algorithm for haplotype assembly of polyploid genomes using linked reads. The performance of the algorithm is evaluated in a number of simulation scenarios, and its applicability is demonstrated on a real dataset of sweet potato.
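The barcode-merging and clustering steps described in the abstract can be illustrated with a small Python sketch. This is not the Hap10 implementation: the function names `merge_by_barcode` and `group_fragments`, the read representation (a barcode plus a mapping from variant index to observed allele), and the majority vote used to settle conflicting alleles are all assumptions made for illustration. The grouping step uses plain connected components (fragments that share at least one variant), which is a simpler stand-in for the "strongly connected components" criterion mentioned in the abstract.

```python
from collections import Counter, defaultdict


def merge_by_barcode(reads):
    """Combine reads that share a barcode into one fragment.

    `reads` is an iterable of (barcode, {variant_index: allele}) pairs.
    Conflicting alleles at a variant are settled by majority vote
    (an assumption of this sketch, not taken from the paper).
    """
    votes = defaultdict(lambda: defaultdict(Counter))
    for barcode, alleles in reads:
        for variant, allele in alleles.items():
            votes[barcode][variant][allele] += 1
    return {
        barcode: {v: counts.most_common(1)[0][0] for v, counts in per_variant.items()}
        for barcode, per_variant in votes.items()
    }


def group_fragments(fragments):
    """Group fragments that are linked through shared variants.

    Uses union-find to compute ordinary connected components; the
    stricter connectivity criterion used by Hap10 is not modelled here.
    """
    parent = {barcode: barcode for barcode in fragments}

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # variant index -> first fragment that covers it
    for barcode, alleles in fragments.items():
        for variant in alleles:
            if variant in first_seen:
                parent[find(barcode)] = find(first_seen[variant])
            else:
                first_seen[variant] = barcode

    groups = defaultdict(set)
    for barcode in fragments:
        groups[find(barcode)].add(barcode)
    return sorted(sorted(group) for group in groups.values())


# Toy input: hypothetical barcodes and biallelic variants (0 = ref, 1 = alt).
reads = [
    ("BX1", {0: 0, 1: 1}),
    ("BX1", {1: 1, 2: 0}),
    ("BX2", {2: 0, 3: 1}),
    ("BX3", {7: 1, 8: 0}),
]
fragments = merge_by_barcode(reads)
components = group_fragments(fragments)
```

In this toy input, the two `BX1` reads merge into a single fragment covering variants 0 to 2, which shares variant 2 with `BX2`, so both land in one component, while `BX3` forms its own. Each component would then be passed separately to a haplotype assembly core.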

Updated: 2020-06-18