High-resolution comparative analysis of great ape genomes
Science (IF 44.7), Pub Date: 2018-06-07, DOI: 10.1126/science.aar6343
Zev N Kronenberg, Ian T Fiddes, David Gordon, Shwetha Murali, Stuart Cantsilieris, Olivia S Meyerson, Jason G Underwood, Bradley J Nelson, Mark J P Chaisson, Max L Dougherty, Katherine M Munson, Alex R Hastie, Mark Diekhans, Fereydoun Hormozdiari, Nicola Lorusso, Kendra Hoekzema, Ruolan Qiu, Karen Clark, Archana Raja, AnneMarie E Welch, Melanie Sorensen, Carl Baker, Robert S Fulton, Joel Armstrong, Tina A Graves-Lindsay, Ahmet M Denli, Emma R Hoppe, PingHsun Hsieh, Christopher M Hill, Andy Wing Chun Pang, Joyce Lee, Ernest T Lam, Susan K Dutcher, Fred H Gage, Wesley C Warren, Jay Shendure, David Haussler, Valerie A Schneider, Han Cao, Mario Ventura, Richard K Wilson, Benedict Paten, Alex Pollen, Evan E Eichler

A spotlight on great ape genomes

Most nonhuman primate genomes generated to date have been “humanized” owing to their many gaps and the reliance on guidance by the reference human genome. To remove this humanizing effect, Kronenberg et al. generated and assembled long-read genomes of a chimpanzee, an orangutan, and two humans and compared them with a previously generated gorilla genome. This analysis recognized genomic structural variation specific to humans and particular ape lineages. Comparisons between human and chimpanzee cerebral organoids showed down-regulation of the expression of specific genes in humans, relative to chimpanzees, related to noncoding variation identified in this analysis.

Science, this issue p. eaar6343

Analysis of long-read great ape and human genomes identifies human-specific changes affecting brain gene expression.

INTRODUCTION Understanding the genetic differences that make us human is a long-standing endeavor that requires the comprehensive discovery and comparison of all forms of genetic variation within great ape lineages.

RATIONALE The varied quality and completeness of ape genomes have limited comparative genetic analyses. To eliminate this contiguity and quality disparity, we generated human and nonhuman ape genome assemblies without the guidance of the human reference genome. These new genome assemblies enable both coarse and fine-scale comparative genomic studies.

RESULTS We sequenced and assembled two human, one chimpanzee, and one orangutan genome using high-coverage (>65x) single-molecule, real-time (SMRT) long-read sequencing technology. We also sequenced more than 500,000 full-length complementary DNA samples from induced pluripotent stem cells to construct de novo gene models, increasing our knowledge of transcript diversity in each ape lineage.
The new nonhuman ape genome assemblies improve gene annotation and genomic contiguity (by 30- to 500-fold), resulting in the identification of larger synteny blocks (by 22- to 74-fold) when compared to earlier assemblies. Including the latest gorilla genome, we now estimate that 83% of the ape genomes can be compared in a multiple sequence alignment. We observe a modest increase in single-nucleotide variant divergence compared to previous genome analyses and estimate that 36% of human autosomal DNA is subject to incomplete lineage sorting. We fully resolve most common repeat differences, including full-length retrotransposons such as the African ape-specific endogenous retroviral element PtERV1. We show that the spread of this element independently in the gorilla and chimpanzee lineage likely resulted from a founder element that failed to segregate to the human lineage because of incomplete lineage sorting. The improved sequence contiguity allowed a more systematic discovery of structural variation (>50 base pairs in length) (see the figure). We detected 614,186 ape deletions, insertions, and inversions, assigning each to specific ape lineages. Unbiased genome scaffolding (optical maps, bacterial artificial chromosome sequencing, and fluorescence in situ hybridization) led to the discovery of large, unknown complex inversions in gene-rich regions. Of the 17,789 fixed human-specific insertions and deletions, we focus on those of potential functional effect. We identify 90 that are predicted to disrupt genes and an additional 643 that likely affect regulatory regions, more than doubling the number of human-specific deletions that remove regulatory sequence in the human lineage. We investigate the association of structural variation with changes in human-chimpanzee brain gene expression using cerebral organoids as a proxy for expression differences. 
Genes associated with fixed structural variants (SVs) show a pattern of down-regulation in human radial glial neural progenitors, whereas human-specific duplications are associated with up-regulated genes in human radial glial and excitatory neurons (see the figure).

CONCLUSION The improved ape genome assemblies provide the most comprehensive view to date of intermediate-size structural variation and highlight several dozen genes associated with structural variation and brain-expression differences between humans and chimpanzees. These new references will provide a stepping stone for the completion of great ape genomes at a quality commensurate with the human reference genome and, ultimately, an understanding of the genetic differences that make us human.

SMRT assemblies and SV analyses. (Top) Contiguity of the de novo assemblies. (Bottom, left to right) For each ape, SV detection was done against the human reference genome (as represented by a dot plot of an inversion). Human-specific SVs, identified by comparing ape SVs and population genotyping (0/0, homozygous reference), were compared to single-cell gene expression differences [range: low (dark blue) to high (dark red)] in primary and organoid tissues. Each heatmap row is a gene that intersects an insertion or deletion (green), duplication (cyan), or inversion (light green).

Genetic studies of human evolution require high-quality contiguous ape genome assemblies that are not guided by the human reference. We coupled long-read sequence assembly and full-length complementary DNA sequencing with a multiplatform scaffolding approach to produce ab initio chimpanzee and orangutan genome assemblies. By comparing these with two long-read de novo human genome assemblies and a gorilla genome assembly, we characterized lineage-specific and shared great ape genetic variation ranging from single– to mega–base pair–sized variants.
We identified ~17,000 fixed human-specific structural variants identifying genic and putative regulatory changes that have emerged in humans since divergence from nonhuman apes. Interestingly, these variants are enriched near genes that are down-regulated in human compared to chimpanzee cerebral organoids, particularly in cells analogous to radial glial neural progenitors.

Updated: 2018-06-07