Assembly and annotation of an Ashkenazi human reference genome
Genome Biology (IF 10.1), Pub Date: 2020-06-02, DOI: 10.1186/s13059-020-02047-7
Alaina Shumate, Aleksey V Zimin, Rachel M Sherman, Daniela Puiu, Justin M Wagner, Nathan D Olson, Mihaela Pertea, Marc L Salit, Justin M Zook, Steven L Salzberg

Background: Thousands of experiments and studies use the human reference genome as a resource each year. This single reference genome, GRCh38, is a mosaic created from a small number of individuals, representing a very small sample of the human population. There is a need for reference genomes from multiple human populations to avoid potential biases.

Results: Here, we describe the assembly and annotation of the genome of an Ashkenazi individual and the creation of a new, population-specific human reference genome. This genome is more contiguous and more complete than GRCh38, the latest version of the human reference genome, and is annotated with highly similar gene content. The Ashkenazi reference genome, Ash1, contains 2,973,118,650 nucleotides as compared to 2,937,639,212 in GRCh38. Annotation identified 20,157 protein-coding genes, of which 19,563 are >99% identical to their counterparts on GRCh38. Most of the remaining genes have small differences. Forty of the protein-coding genes in GRCh38 are missing from Ash1; however, all of these genes are members of multi-gene families for which Ash1 contains other copies. Eleven genes appear on different chromosomes from their homologs in GRCh38. Alignment of DNA sequences from an unrelated Ashkenazi individual to Ash1 identified ~1 million fewer homozygous SNPs than alignment of those same sequences to the more-distant GRCh38 genome, illustrating one of the benefits of population-specific reference genomes.

Conclusions: The Ash1 genome is presented as a reference for any genetic studies involving Ashkenazi Jewish individuals.
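To put the headline figures in proportion, the short sketch below works through the arithmetic implied by the abstract. It uses only the numbers quoted above; the variable names are illustrative and not taken from the paper.

```python
# Figures quoted in the abstract (variable names are illustrative).
ash1_bases = 2_973_118_650      # nucleotides in Ash1
grch38_bases = 2_937_639_212    # nucleotides in GRCh38, as quoted
total_genes = 20_157            # protein-coding genes annotated on Ash1
near_identical = 19_563         # genes >99% identical to GRCh38 counterparts

# Additional sequence in Ash1 relative to GRCh38.
extra_bases = ash1_bases - grch38_bases
extra_fraction = extra_bases / grch38_bases

# Share of annotated genes that are >99% identical, and the remainder.
identical_fraction = near_identical / total_genes
remaining_genes = total_genes - near_identical

print(f"Extra bases in Ash1: {extra_bases:,} ({extra_fraction:.2%} of GRCh38)")
print(f"Genes >99% identical: {identical_fraction:.2%}")
print(f"Genes at or below 99% identity: {remaining_genes}")
```

By these figures, Ash1 carries roughly 35.5 million more nucleotides than GRCh38, and about 97% of its annotated protein-coding genes are more than 99% identical to their GRCh38 counterparts.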

Updated: 2020-06-02