Evaluating the role of reference-genome phylogenetic distance on evolutionary inference
Molecular Ecology Resources (IF 7.7), Pub Date: 2021-06-27, DOI: 10.1111/1755-0998.13457
Aparna Prasad, Eline D Lorenzen, Michael V Westbury

When a high-quality genome assembly of a target species is unavailable, a mapping-based assembly offers a way to avoid the costly de novo assembly process. However, mapping shotgun data to a distant relative may lead to biased or erroneous evolutionary inference. Here, we used short-read data from a mammal (beluga whale) and a bird species (rowi kiwi) to evaluate whether reference genome phylogenetic distance can impact downstream demographic (Pairwise Sequentially Markovian Coalescent, PSMC) and genetic diversity (heterozygosity, runs of homozygosity) analyses. We mapped to assemblies of species of varying phylogenetic distance (from conspecific to genome-wide divergence of >7%), and to de novo assemblies created using cross-species scaffolding. We show that while reference genome phylogenetic distance has an impact on demographic analyses, the impact is not pronounced until the reference genome has >3% divergence from the target species. When mapping to cross-species scaffolded assemblies, we are unable to replicate the original beluga demographic results, but are able to do so for the rowi kiwi, presumably reflecting the more fragmented nature of the beluga assemblies. We find that increased phylogenetic distance has a pronounced impact on genetic diversity estimates; heterozygosity estimates deviate incrementally with increasing phylogenetic distance. Moreover, runs of homozygosity are largely undetectable when mapping to any nonconspecific assembly. However, these biases can be reduced when mapping to a cross-species scaffolded assembly. Taken together, our results show that caution should be exercised when selecting reference genomes. Cross-species scaffolding may offer a way to avoid a costly, traditional de novo assembly while still producing robust evolutionary inference.
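
As a rough illustration of the two genetic diversity measures named in the abstract, the toy Python sketch below computes heterozygosity and runs of homozygosity from a list of already-called genotypes. It is not the pipeline used in the study, which works from mapped short-read data. The function names, the genotype encoding (0 = homozygous, 1 = heterozygous, -1 = missing) and the length threshold are all assumptions made for this example.

```python
from typing import List, Tuple


def heterozygosity(genotypes: List[int]) -> float:
    """Fraction of called sites that are heterozygous.

    Assumed encoding: 0 = homozygous, 1 = heterozygous, -1 = missing.
    """
    called = [g for g in genotypes if g >= 0]
    if not called:
        raise ValueError("no called sites")
    return sum(called) / len(called)


def runs_of_homozygosity(
    positions: List[int], genotypes: List[int], min_length: int
) -> List[Tuple[int, int]]:
    """Return (start, end) of homozygous stretches spanning >= min_length bp.

    A heterozygous call ends the current run; missing calls are skipped
    and do not break a run (a simplifying assumption of this sketch).
    """
    runs = []
    start = None  # position of the first homozygous site in the current run
    last = None   # position of the most recent homozygous site
    for pos, g in zip(positions, genotypes):
        if g == 0:
            if start is None:
                start = pos
            last = pos
        elif g == 1:
            if start is not None and last - start + 1 >= min_length:
                runs.append((start, last))
            start = None
    # Close a run that extends to the end of the data.
    if start is not None and last - start + 1 >= min_length:
        runs.append((start, last))
    return runs


positions = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
genotypes = [0, 0, 0, 0, 1, 0, 0, -1, 0, 0]
het = heterozygosity(genotypes)
roh = runs_of_homozygosity(positions, genotypes, min_length=4)
```

In this toy setting, a single extra heterozygous call inside a long homozygous stretch splits the run in two, which gives some intuition for why spurious calls could make runs of homozygosity hard to detect.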

Updated: 2021-06-27