AsmMix: A pipeline for high quality diploid de novo assembly
bioRxiv - Genomics, Pub Date: 2021-01-18, DOI: 10.1101/2021.01.15.426893
Pei Wu, Chao Liu, Ou Wang, Xia Zhao, Fang Chen, Xiaofang Cheng, Hongmei Zhu

In this paper, we report AsmMix, a pipeline capable of producing contiguous, high-quality diploid genomes. The pipeline consists of two steps. In the first step, two sets of assemblies are generated: one is based on co-barcoded reads and is highly accurate and haplotype-resolved but contains many gaps; the other is based on single-molecule sequencing reads and is contiguous but error-prone. In the second step, the two sets of assemblies are compared and integrated into a haplotype-resolved assembly with fewer errors. We test the pipeline on a dataset from the human genome NA24385, call variants from the resulting assemblies, and compare them against the GIAB benchmark. We show that the AsmMix pipeline can produce highly contiguous, accurate, and haplotype-resolved assemblies. In particular, the assembly mixing process effectively reduces small-scale errors in the long-read assembly.
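
The second step can be pictured with a toy example. The sketch below is not the AsmMix implementation; it only illustrates the general idea of mixing, under the assumption that accurate co-barcoded blocks have already been aligned to one long-read contig of the same haplotype. The function name `mix_assemblies` and the `(start, end, sequence)` block format are hypothetical and were chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical block format (an assumption, not from the paper):
# (start, end, sequence), where [start, end) is the half-open interval on the
# long-read contig that the accurate co-barcoded sequence aligns to.
Block = Tuple[int, int, str]


def mix_assemblies(long_read_contig: str, accurate_blocks: List[Block]) -> str:
    """Toy mixing: use accurate blocks where they exist, long-read bases elsewhere."""
    pieces = []
    cursor = 0  # next long-read position that has not been emitted yet
    for start, end, seq in sorted(accurate_blocks):
        if start < cursor or start > end or end > len(long_read_contig):
            raise ValueError("blocks must be non-overlapping and inside the contig")
        # Region with no co-barcoded coverage: keep the long-read bases.
        pieces.append(long_read_contig[cursor:start])
        # Covered region: trust the more accurate co-barcoded sequence.
        pieces.append(seq)
        cursor = end
    # Tail of the contig after the last block.
    pieces.append(long_read_contig[cursor:])
    return "".join(pieces)


if __name__ == "__main__":
    contig = "ACGTTTGACCA"                    # error-prone but contiguous
    blocks = [(2, 5, "GAT"), (7, 10, "ACG")]  # accurate but gapped
    print(mix_assemblies(contig, blocks))
```

In this toy setting the long-read contig supplies contiguity across the gaps, while the co-barcoded blocks overwrite the stretches they cover. A real pipeline would also have to handle alignment, haplotype assignment, and conflicting blocks, none of which is modeled here.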

Updated: 2021-01-18