Fully phased human genome assembly without parental data using single-cell strand sequencing and long reads
Nature Biotechnology (IF 33.1), Pub Date: 2020-12-07, DOI: 10.1038/s41587-020-0719-5
David Porubsky 1, Peter Ebert 2, Peter A Audano 1, Mitchell R Vollger 1, William T Harvey 1, Pierre Marijon 2, Jana Ebler 2, Katherine M Munson 1, Melanie Sorensen 1, Arvis Sulovari 1, Marina Haukness 3, Maryam Ghareghani 2,4, Peter M Lansdorp 5,6, Benedict Paten 3, Scott E Devine 7, Ashley D Sanders 8, Charles Lee 9,10,11, Mark J P Chaisson 12, Jan O Korbel 8, Evan E Eichler 1,13, Tobias Marschall 2

Human genomes are typically assembled as consensus sequences that lack information on parental haplotypes. Here we describe a reference-free workflow for diploid de novo genome assembly that combines the chromosome-wide phasing and scaffolding capabilities of single-cell strand sequencing [1,2] with continuous long-read or high-fidelity [3] sequencing data. Employing this strategy, we produced a completely phased de novo genome assembly for each haplotype of an individual of Puerto Rican descent (HG00733) in the absence of parental data. The assemblies are accurate (quality value > 40) and highly contiguous (contig N50 > 23 Mbp) with low switch error rates (0.17%), providing fully phased single-nucleotide variants, indels and structural variants. A comparison of Oxford Nanopore Technologies and Pacific Biosciences phased assemblies identified 154 regions that are preferential sites of contig breaks, irrespective of sequencing technology or phasing algorithms.
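
The three quality metrics quoted in the abstract (quality value, contig N50, switch error rate) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names and toy inputs are hypothetical, the quality value is assumed to follow the usual Phred scale, and the switch error rate is assumed to be counted over adjacent heterozygous sites compared against a trusted phasing.

```python
import math


def contig_n50(lengths):
    """Length of the contig at which the running total, taken over contigs
    sorted longest first, reaches at least half of the assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


def phred_qv(error_rate):
    """Phred-scaled quality value for a per-base error rate (assumed definition)."""
    return -10 * math.log10(error_rate)


def switch_error_rate(truth, called):
    """Fraction of adjacent heterozygous-site pairs whose relative phase flips.

    `truth` and `called` are equal-length sequences of 0/1 haplotype labels,
    one per heterozygous site (hypothetical input format).
    """
    if len(truth) != len(called) or len(truth) < 2:
        raise ValueError("need two equal-length sequences with at least 2 sites")
    agree = [t == c for t, c in zip(truth, called)]
    switches = sum(1 for prev, cur in zip(agree, agree[1:]) if prev != cur)
    return switches / (len(agree) - 1)


if __name__ == "__main__":
    print(contig_n50([10, 20, 30, 40]))
    print(phred_qv(1e-4))
    print(switch_error_rate([0, 0, 0, 0, 0], [0, 0, 1, 1, 1]))
```

Under these assumptions, a quality value above 40 corresponds to fewer than one error per 10,000 bases, and a switch error rate of 0.17% corresponds to roughly 1.7 phase flips per 1,000 adjacent heterozygous-site pairs.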




Updated: 2020-12-07