Chromosome-scale, haplotype-resolved assembly of human genomes
Nature Biotechnology (IF 46.9), Pub Date: 2020-12-07, DOI: 10.1038/s41587-020-0711-0
Shilpa Garg 1, 2, 3 , Arkarachai Fungtammasan 4 , Andrew Carroll 5 , Mike Chou 1 , Anthony Schmitt 6 , Xiang Zhou 6 , Stephen Mac 6 , Paul Peluso 7 , Emily Hatas 7 , Jay Ghurye 8 , Jared Maguire 8 , Medhat Mahmoud 9 , Haoyu Cheng 2, 3 , David Heller 10 , Justin M Zook 11 , Tobias Moemke 12 , Tobias Marschall 12, 13 , Fritz J Sedlazeck 9 , John Aach 1 , Chen-Shan Chin 4 , George M Church 1 , Heng Li 2, 3

Haplotype-resolved or phased genome assembly provides a complete picture of genomes and their complex genetic variations. However, current algorithms for phased assembly either do not generate chromosome-scale phasing or require pedigree information, which limits their application. We present a method named diploid assembly (DipAsm) that uses long, accurate reads and long-range conformation data for single individuals to generate a chromosome-scale phased assembly within 1 day. Applied to four public human genomes, PGP1, HG002, NA12878 and HG00733, DipAsm produced haplotype-resolved assemblies with minimum contig length needed to cover 50% of the known genome (NG50) up to 25 Mb and phased ~99.5% of heterozygous sites at 98–99% accuracy, outperforming other approaches in terms of both contiguity and phasing completeness. We demonstrate the importance of chromosome-scale phased assemblies for the discovery of structural variants (SVs), including thousands of new transposon insertions, and of highly polymorphic and medically important regions such as the human leukocyte antigen (HLA) and killer cell immunoglobulin-like receptor (KIR) regions. DipAsm will facilitate high-quality precision medicine and studies of individual haplotype variation and population diversity.
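The NG50 statistic quoted in the abstract (the minimum contig length needed to cover 50% of the known genome) can be illustrated with a minimal sketch. The function name `ng50` and the toy contig lengths are hypothetical and chosen only for illustration; they are not taken from the paper or from DipAsm.

```python
def ng50(contig_lengths, genome_size):
    """Return the NG50 of an assembly.

    Contigs are visited from longest to shortest; the NG50 is the length
    of the contig at which the running total first reaches half of the
    known genome size. Returns 0 if the assembly never covers half of
    the genome (a convention assumed here, not stated in the source).
    """
    target = genome_size / 2
    running_total = 0
    for length in sorted(contig_lengths, reverse=True):
        running_total += length
        if running_total >= target:
            return length
    return 0


# Toy example: four contigs against a known genome size of 120 units.
toy_contigs = [40, 30, 20, 10]
toy_ng50 = ng50(toy_contigs, 120)
```

Unlike N50, which is computed against the total assembly length, NG50 is computed against the known genome size, so it stays comparable across assemblies of different total length.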




Updated: 2020-12-07