Insights into human genetic variation and population history from 929 diverse genomes
Science (IF 44.7), Pub Date: 2020-03-19, DOI: 10.1126/science.aay5012
Anders Bergström, Shane A McCarthy, Ruoyun Hui, Mohamed A Almarri, Qasim Ayub, Petr Danecek, Yuan Chen, Sabine Felkel, Pille Hallast, Jack Kamm, Hélène Blanché, Jean-François Deleuze, Howard Cann, Swapan Mallick, David Reich, Manjinder S Sandhu, Pontus Skoglund, Aylwyn Scally, Yali Xue, Richard Durbin, Chris Tyler-Smith

Genomes from around the globe

Genomic sequencing of diverse human populations to understand overall genetic diversity has lagged behind in-depth examination of specific populations. To add to our understanding of human genetic diversity, Bergström et al. generated whole-genome sequences surveying individuals in the Human Genome Diversity Project, which is a panel of global populations that has been instrumental in understanding the history of human populations. The authors' study adds data about African, Oceanian, and Amerindian populations and indicates that diversity tends to result from differences at the single-nucleotide level rather than copy number variation. An analysis of archaic sequences in modern populations identifies ancestral genetic variation in African populations that likely predates modern humans and has been lost in most non-African populations.

Science, this issue p. eaay5012

Genomes from diverse human populations record human genetic diversity and illuminate the history of our species.

INTRODUCTION

Large-scale human genome-sequencing studies to date have been limited to large, metropolitan populations or to small numbers of genomes from each group. Much remains to be understood about the extent and structure of genetic variation in our species and how it was shaped by past population separations, admixture, adaptation, size changes, and gene flow from archaic human groups. Larger numbers of genome sequences from more diverse populations are needed to answer these questions.

RATIONALE

We sequenced 929 genomes from 54 geographically, linguistically, and culturally diverse human populations to an average of 35× coverage and analyzed the variation among them. We also physically resolved the haplotype phase of 26 of these genomes using linked-read sequencing.

RESULTS

We identified 67.3 million single-nucleotide polymorphisms, 8.8 million small insertions or deletions (indels), and 40,736 copy number variants. This includes hundreds of thousands of variants that had not been discovered by previous sequencing efforts, but which are common in one or more population. We demonstrate benefits to the study of population relationships of genome sequences over ascertained array genotypes, particularly when involving African populations. Populations in central and southern Africa, the Americas, and Oceania each harbor tens to hundreds of thousands of private, common genetic variants. Most of these variants arose as new mutations rather than through archaic introgression, except in Oceanian populations, where many private variants derive from Denisovan admixture. Although some reach high frequencies, no variants are fixed between major geographical regions.

We estimate that the genetic separation between present-day human populations occurred mostly within the past 250,000 years. However, these early separations were gradual in nature and shaped by protracted gene flow. All populations thus still had some genetic contact more recently than this, but there is also evidence that a small fraction of present-day structure might be hundreds of thousands of years older. Most populations expanded in size over the past 10,000 years, but hunter-gatherer groups did not.

The low diversity among the Neanderthal haplotypes segregating in present-day populations indicates that, while more than one Neanderthal individual must have contributed genetic material to modern humans, there was likely only one major episode of admixture. By contrast, Denisovan haplotype diversity reflects a more complex history involving more than one episode of admixture. We found small amounts of Neanderthal ancestry in West African genomes, most likely reflecting Eurasian admixture. Despite their very low levels or absence of archaic ancestry, African populations share many Neanderthal and Denisovan variants that are absent from Eurasia, reflecting how a larger proportion of the ancestral human variation has been maintained in Africa.

CONCLUSION

The discovery of substantial amounts of common genetic variation that was previously undocumented and is geographically restricted highlights the continued value of anthropologically informed study designs for understanding human diversity. The genome sequences presented here are a freely available resource with relevance to population history, medical genetics, anthropology, and linguistics.

Structure of genetic variation across worldwide human populations. Shown is a schematic illustration of the approximate amounts of four different classes of genetic variation found in different geographical regions. The origins of the populations included in the study are indicated by dots.

Genome sequences from diverse human groups are needed to understand the structure of genetic variation in our species and the history of, and relationships between, different populations. We present 929 high-coverage genome sequences from 54 diverse human populations, 26 of which are physically phased using linked-read sequencing. Analyses of these genomes reveal an excess of previously undocumented common genetic variation private to southern Africa, central Africa, Oceania, and the Americas, but an absence of such variants fixed between major geographical regions. We also find deep and gradual population separations within Africa, contrasting population size histories between hunter-gatherer and agriculturalist groups in the past 10,000 years, and a contrast between single Neanderthal but multiple Denisovan source populations contributing to present-day human populations.
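
The terms "private", "common", and "fixed between regions" used above can be made concrete with a small sketch. This is an illustration only, not the authors' pipeline: the abstract does not state the frequency cutoff for "common", so the 10% threshold is an assumption, as are the function names, the region labels, and the toy frequencies (which are invented and not taken from the study).

```python
from typing import Dict, List

# Assumed cutoff for "common"; the abstract does not specify one.
COMMON_THRESHOLD = 0.10


def private_common_variants(
    freqs: Dict[str, Dict[str, float]],
    common_threshold: float = COMMON_THRESHOLD,
) -> Dict[str, List[str]]:
    """Map each region to variants that are common there and absent elsewhere.

    freqs maps a variant ID to {region name: alternate-allele frequency}.
    """
    result: Dict[str, List[str]] = {}
    for variant, by_region in freqs.items():
        # Regions in which the variant is observed at all.
        present = [region for region, f in by_region.items() if f > 0.0]
        if len(present) != 1:
            continue  # absent everywhere or shared between regions: not private
        region = present[0]
        if by_region[region] >= common_threshold:
            result.setdefault(region, []).append(variant)
    return result


def fixed_between_regions(by_region: Dict[str, float]) -> bool:
    """True if the allele is at frequency 1.0 in one region and 0.0 in another."""
    values = list(by_region.values())
    return any(f == 1.0 for f in values) and any(f == 0.0 for f in values)


# Invented toy frequencies for illustration only.
toy_freqs = {
    "var1": {"Africa": 0.25, "Oceania": 0.0, "America": 0.0},    # private and common
    "var2": {"Africa": 0.02, "Oceania": 0.0, "America": 0.0},    # private but rare
    "var3": {"Africa": 0.0, "Oceania": 0.40, "America": 0.0},    # private and common
    "var4": {"Africa": 0.30, "Oceania": 0.10, "America": 0.05},  # shared
}

private = private_common_variants(toy_freqs)
any_fixed = any(fixed_between_regions(f) for f in toy_freqs.values())
print(private, any_fixed)
```

In this toy table a variant can reach a high frequency within one region without being fixed between regions, which is the pattern the RESULTS text describes. A real analysis would also have to account for sample size, since a frequency of zero in a finite sample does not prove absence.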

Updated: 2020-03-19