Constructing telomere-to-telomere diploid genome by polishing haploid nanopore-based assembly
Nature Methods (IF 48.0), Pub Date: 2024-03-08, DOI: 10.1038/s41592-023-02141-1
Joshua Casey Darian, Ritu Kundu, Ramesh Rajaby, Wing-Kin Sung

Draft genomes generated from Oxford Nanopore Technologies (ONT) long reads are known to have a high error rate. Although existing genome polishers can improve their quality, the remaining error rate (including mismatches, indels and switching errors between paternal and maternal haplotypes) can still be significant. Here, we develop two polishers, hypo-short and hypo-hybrid, to address this issue. Hypo-short uses Illumina short reads to polish an ONT-based draft assembly, resulting in a high-quality assembly with low error rates and few switching errors. Expanding on this, hypo-hybrid incorporates ONT long reads to further refine the assembly into a diploid representation. Building on hypo-hybrid, we have created a diploid genome assembly pipeline called hypo-assembler. Hypo-assembler automates the generation of highly accurate, contiguous and nearly complete diploid assemblies using ONT long reads, Illumina short reads and, optionally, Hi-C reads. Notably, our solution even allows for the production of telomere-to-telomere diploid genomes with additional manual steps. As a proof of concept, we successfully assembled a fully phased telomere-to-telomere diploid genome of HG00733, achieving a quality value exceeding 50.
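
To give a sense of scale for the reported quality value (QV), the short sketch below converts between QV and per-base error rate. It assumes the conventional Phred-scaled definition, QV = -10 * log10(error rate), which is the usual convention for assembly consensus quality; the abstract itself does not state how QV was measured. The function names are hypothetical and are not part of hypo-assembler.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error rate.

    Assumes the standard definition QV = -10 * log10(error_rate).
    """
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Convert a per-base error rate (0 < rate <= 1) to a Phred-scaled QV."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -10 * math.log10(error_rate)


if __name__ == "__main__":
    # Print the error rate and expected spacing between errors for a few QVs.
    for qv in (30, 40, 50):
        rate = qv_to_error_rate(qv)
        print(f"QV {qv}: about 1 error per {round(1 / rate):,} bases")
```

Under this definition, a QV above 50 corresponds to fewer than about one consensus error per 100,000 bases.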




Updated: 2024-03-09