Kermit: linkage map guided long read assembly.
Algorithms for Molecular Biology (IF 1), Pub Date: 2019-03-20, DOI: 10.1186/s13015-019-0143-x
Riku Walve 1 , Pasi Rastas 2 , Leena Salmela 1

BACKGROUND: With long reads getting even longer and cheaper, large scale sequencing projects can be accomplished without short reads at an affordable cost. Due to the high error rates and less mature tools, de novo assembly of long reads is still challenging and often results in a large collection of contigs. Dense linkage maps are collections of markers whose location on the genome is approximately known. Therefore, they provide long range information that has the potential to greatly aid in de novo assembly. Previously, linkage maps have been used to detect misassemblies and to manually order contigs. However, no fully automated tools exist to incorporate linkage maps into assembly; instead, a large amount of manual labour is needed to order the contigs into chromosomes.

RESULTS: We formulate the genome assembly problem in the presence of linkage maps and present the first method for guided genome assembly using linkage maps. Our method is based on an additional cleaning step added to the assembly. We show that it can simplify the underlying assembly graph, resulting in more contiguous assemblies and reducing the number of misassemblies when compared to de novo assembly.

CONCLUSIONS: We present the first method to integrate linkage maps directly into genome assembly. With a modest increase in runtime, our method improves contiguity and correctness of genome assembly.
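The abstract does not spell out how the cleaning step works, so the following is only a rough illustration of the general idea, not the authors' implementation. It assumes, hypothetically, that each read can be labelled with a (chromosome, bin) position derived from linkage map markers, and that an overlap edge is dropped when the two reads' positions disagree. The names `clean_overlap_graph`, `read_bins`, and `max_bin_gap` are invented for this sketch.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]          # an overlap between two read IDs
MapPos = Tuple[str, int]        # (chromosome, bin index) from the linkage map


def clean_overlap_graph(
    edges: List[Edge],
    read_bins: Dict[str, MapPos],
    max_bin_gap: int = 1,
) -> List[Edge]:
    """Keep an overlap edge only if the linkage map does not contradict it.

    Assumption of this sketch: reads with no marker information carry no
    evidence against an overlap, so their edges are kept.
    """
    kept: List[Edge] = []
    for a, b in edges:
        pos_a = read_bins.get(a)
        pos_b = read_bins.get(b)
        if pos_a is None or pos_b is None:
            kept.append((a, b))  # no map evidence, keep the edge
            continue
        same_chrom = pos_a[0] == pos_b[0]
        close_bins = abs(pos_a[1] - pos_b[1]) <= max_bin_gap
        if same_chrom and close_bins:
            kept.append((a, b))  # positions agree, keep the edge
        # otherwise the edge joins distant map positions and is dropped
    return kept


# Toy data: r5 has no marker, r4 maps to a different chromosome than r2.
edges = [("r1", "r2"), ("r2", "r3"), ("r2", "r4"), ("r4", "r5")]
read_bins = {
    "r1": ("chr1", 0),
    "r2": ("chr1", 0),
    "r3": ("chr1", 1),
    "r4": ("chr2", 5),
}
cleaned = clean_overlap_graph(edges, read_bins)
print(cleaned)
```

In this toy input only the `r2`/`r4` overlap is removed, since it would join reads placed on different chromosomes. Removing such edges is one way an assembly graph could become simpler before contigs are built.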

Updated: 2019-11-01