A high coverage Mesolithic aurochs genome and effective leveraging of ancient cattle genomes using whole genome imputation
Molecular Biology and Evolution (IF 10.7), Pub Date: 2024-04-25, DOI: 10.1093/molbev/msae076
Jolijn A M Erven 1,2, Amelie Scheu 2,3, Marta Pereira Verdugo 2, Lara Cassidy 2, Ningbo Chen 4, Birgit Gehlen 5, Martin Street 6, Ole Madsen 7, Victoria E Mullin 2

Ancient genomic analyses are often restricted to pseudo-haploid data because of low genome coverage. Imputation can leverage low-coverage data to calculate phased diploid genotypes, enabling haplotype-based interrogation and SNP calling at unsequenced positions, which is highly desirable. This approach has not been investigated for ancient cattle genomes, despite these being compelling subjects for archaeological, evolutionary and economic reasons. Here we test it by sequencing a Mesolithic European aurochs (18.49x; 9852-9376 calBCE) and an Early Medieval European cow (18.69x; 427-580 calCE), and combining these with published individuals: two ancient and three modern. We downsample these genomes (0.25x, 0.5x, 1.0x, 2.0x) and impute diploid genotypes using a reference panel of 171 published modern cattle genomes that we curated for 21.7 million (Mn) phased single-nucleotide polymorphisms (SNPs). We recover high densities of correct calls, with an accuracy of >99.1% at variant sites for the lowest downsample depth of 0.25x, increasing to >99.5% for 2.0x (transversions only, minor allele frequency (MAF) ≥2.5%). The recovery of SNPs correlates with coverage: on average, 58% of sites are recovered at 0.25x, increasing to 87% at 2.0x, using an average of 3.5 Mn transversions (MAF ≥2.5%). This holds even in the aurochs, despite it having the greatest temporal distance from the modern reference panel. Our imputed genomes behave similarly to directly called data in allele-frequency-based analyses, for example consistently identifying runs of homozygosity >2 Mb, including a long homozygous region in the Mesolithic European aurochs.
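
The downsample-and-compare evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the dictionary-based genotype representation, and the reading of "variant site" as a site where the high-coverage genotype carries at least one alternate allele are all assumptions made for illustration. The sketch shows how a read-retention fraction follows from the original and target depths, and how accuracy and recovery might be computed when restricted to transversions with MAF ≥2.5%.

```python
import math
from typing import Dict, Tuple

Site = Tuple[str, int]  # (chromosome, position); hypothetical representation

# Transitions are A<->G and C<->T; every other substitution is a transversion.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def downsample_fraction(original_depth: float, target_depth: float) -> float:
    """Fraction of reads to keep so that mean depth falls to target_depth."""
    if original_depth <= 0 or target_depth <= 0:
        raise ValueError("depths must be positive")
    if target_depth > original_depth:
        raise ValueError("target depth exceeds original depth")
    return target_depth / original_depth


def is_transversion(ref: str, alt: str) -> bool:
    """True for a single-base purine<->pyrimidine substitution."""
    return ref != alt and (ref, alt) not in TRANSITIONS


def evaluate_imputation(
    truth: Dict[Site, int],      # high-coverage genotypes as alt-allele counts (0, 1, 2)
    imputed: Dict[Site, int],    # imputed genotypes retained after any filtering
    panel: Dict[Site, Tuple[str, str, float]],  # site -> (ref, alt, MAF) in the reference panel
    min_maf: float = 0.025,
) -> Tuple[float, float]:
    """Return (accuracy, recovery) at eligible variant sites.

    Assumption: a 'variant site' is one where the high-coverage genotype
    carries at least one alternate allele. Eligible sites are transversions
    with panel MAF >= min_maf.
    """
    eligible = [
        site
        for site, (ref, alt, maf) in panel.items()
        if is_transversion(ref, alt) and maf >= min_maf and truth.get(site, 0) > 0
    ]
    if not eligible:
        raise ValueError("no eligible variant sites")

    recovered = [site for site in eligible if site in imputed]
    correct = sum(1 for site in recovered if imputed[site] == truth[site])

    accuracy = correct / len(recovered) if recovered else math.nan
    recovery = len(recovered) / len(eligible)
    return accuracy, recovery
```

For the 18.49x aurochs, `downsample_fraction(18.49, 0.25)` gives the share of reads to retain for the 0.25x test. In practice the genotype dictionaries would be populated from VCF files rather than built by hand.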

Updated: 2024-04-25