Nanopore sequencing and assembly of a human genome with ultra-long reads.
Nature Biotechnology ( IF 33.1 ) Pub Date : 2018-04-01 , DOI: 10.1038/nbt.4060
Miten Jain 1 , Sergey Koren 2 , Karen H Miga 1 , Josh Quick 3 , Arthur C Rand 1 , Thomas A Sasani 4, 5 , John R Tyson 6 , Andrew D Beggs 7 , Alexander T Dilthey 2 , Ian T Fiddes 1 , Sunir Malla 8 , Hannah Marriott 8 , Tom Nieto 7 , Justin O'Grady 9 , Hugh E Olsen 1 , Brent S Pedersen 4, 5 , Arang Rhie 2 , Hollian Richardson 9 , Aaron R Quinlan 4, 5, 10 , Terrance P Snutch 6 , Louise Tee 7 , Benedict Paten 1 , Adam M Phillippy 2 , Jared T Simpson 11, 12 , Nicholas J Loman 3 , Matthew Loose 8

We report the sequencing and assembly of a reference genome for the human GM12878 Utah/Ceph cell line using the MinION (Oxford Nanopore Technologies) nanopore sequencer. 91.2 Gb of sequence data, representing ∼30× theoretical coverage, were produced. Reference-based alignment enabled detection of large structural variants and epigenetic modifications. De novo assembly of nanopore reads alone yielded a contiguous assembly (NG50 ∼3 Mb). We developed a protocol to generate ultra-long reads (N50 > 100 kb, read lengths up to 882 kb). Incorporating an additional 5× coverage of these ultra-long reads more than doubled the assembly contiguity (NG50 ∼6.4 Mb). The final assembled genome was 2,867 million bases in size, covering 85.8% of the reference. Assembly accuracy, after incorporating complementary short-read sequencing data, exceeded 99.8%. Ultra-long reads enabled assembly and phasing of the 4-Mb major histocompatibility complex (MHC) locus in its entirety, measurement of telomere repeat length, and closure of gaps in the reference human genome assembly GRCh38.
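The coverage and contiguity figures quoted above (∼30× coverage, read N50, assembly NG50) follow from simple arithmetic. The sketch below is illustrative only and is not the authors' pipeline: the function names, the toy contig lengths, and the 3.1 Gb genome size used for the coverage estimate are all assumptions made for the example.

```python
def n50(lengths, target_total=None):
    """Length L such that sequences of length >= L sum to at least half the target.

    With target_total=None the target is the sum of the lengths (N50).
    Passing an assumed genome size as target_total gives NG50 instead.
    """
    total = sum(lengths) if target_total is None else target_total
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # the sequences cover less than half of the target


def coverage(total_bases, genome_size):
    """Theoretical fold coverage: sequenced bases divided by genome size."""
    return total_bases / genome_size


# Hypothetical contig lengths, chosen only to make the arithmetic easy to follow.
toy_contigs = [80, 70, 50, 40, 30, 20]

toy_n50 = n50(toy_contigs)                      # half of 290 is reached at the 70 contig
toy_ng50 = n50(toy_contigs, target_total=400)   # half of 400 is reached at the 50 contig

# 91.2 Gb is from the abstract; the 3.1 Gb genome size is an assumption.
approx_coverage = coverage(91.2e9, 3.1e9)       # roughly 29.4, i.e. about 30x
```

NG50 is the stricter statistic of the two: it measures contiguity against the expected genome size rather than against the assembly's own total length, so an incomplete assembly cannot inflate it.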

Updated: 2018-01-29