Linked-read sequencing enables haplotype-resolved resequencing at population scale.
Molecular Ecology Resources (IF 5.5), Pub Date: 2020-05-18, DOI: 10.1111/1755-0998.13192
Dave Lutgen 1, Raphael Ritter 1, Remi-André Olsen 2, Holger Schielzeth 1, Joel Gruselius 3, Philip Ewels 2, Jesús T García 4, Hadoram Shirihai 5, Manuel Schweizer 5,6, Alexander Suh 7, Reto Burri 1

The feasibility to sequence entire genomes of virtually any organism provides unprecedented insights into the evolutionary history of populations and species. Nevertheless, many population genomic inferences – including the quantification and dating of admixture, introgression and demographic events, and inference of selective sweeps – are still limited by the lack of high‐quality haplotype information. The newest generation of sequencing technology now promises significant progress. To establish the feasibility of haplotype‐resolved genome resequencing at population scale, we investigated properties of linked‐read sequencing data of songbirds of the genus Oenanthe across a range of sequencing depths. Our results based on the comparison of downsampled (25×, 20×, 15×, 10×, 7×, and 5×) with high‐coverage data (46–68×) of seven bird genomes mapped to a reference suggest that phasing contiguities and accuracies adequate for most population genomic analyses can be reached already with moderate sequencing effort. At 15× coverage, phased haplotypes span about 90% of the genome assembly, with 50% and 90% of phased sequences located in phase blocks longer than 1.25–4.6 Mb (N50) and 0.27–0.72 Mb (N90). Phasing accuracy reaches beyond 99% starting from 15× coverage. Higher coverages yielded higher contiguities (up to about 7 Mb/1 Mb [N50/N90] at 25× coverage), but only marginally improved phasing accuracy. Phase block contiguity improved with input DNA molecule length; thus, higher‐quality DNA may help keeping sequencing costs at bay. In conclusion, even for organisms with gigabase‐sized genomes like birds, linked‐read sequencing at moderate depth opens an affordable avenue towards haplotype‐resolved genome resequencing at population scale.
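
The N50 and N90 figures above summarise phase block contiguity: the block length at or above which 50% (or 90%) of all phased sequence is located. A minimal sketch of that calculation follows, assuming the phase block lengths are already available as a plain list of numbers. The function name `phase_block_nx` and the example lengths are hypothetical and are not taken from the paper or its pipeline.

```python
def phase_block_nx(block_lengths, fraction):
    """Return the length L such that `fraction` of all phased bases lie in
    phase blocks of length >= L (fraction=0.5 gives N50, 0.9 gives N90).

    Illustrative helper only; not the authors' implementation.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    lengths = sorted(block_lengths, reverse=True)
    if not lengths:
        raise ValueError("block_lengths must not be empty")

    target = fraction * sum(lengths)
    running = 0
    # Walk from the longest block down until the cumulative length
    # covers the requested fraction of all phased sequence.
    for length in lengths:
        running += length
        if running >= target:
            return length
    return lengths[-1]  # only reachable through floating-point rounding


# Hypothetical phase block lengths in Mb (made-up values, not data from the study).
example_blocks_mb = [5, 4, 3, 2, 1]
n50 = phase_block_nx(example_blocks_mb, 0.5)
n90 = phase_block_nx(example_blocks_mb, 0.9)
```

Because N90 requires covering more of the phased sequence, it is always less than or equal to N50 for the same set of blocks, which matches the ordering of the ranges reported in the abstract.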

Updated: 2020-05-18