The Perfect Storm: Gene Tree Estimation Error, Incomplete Lineage Sorting, and Ancient Gene Flow Explain the Most Recalcitrant Ancient Angiosperm Clade, Malpighiales
bioRxiv - Evolutionary Biology, Pub Date: 2020-05-27, DOI: 10.1101/2020.05.26.112318
Liming Cai, Zhenxiang Xi, Emily Moriarty Lemmon, Alan R. Lemmon, Austin Mast, Christopher E. Buddenhagen, Liang Liu, Charles C. Davis

The genomic revolution offers renewed hope of resolving rapid radiations in the Tree of Life. The development of the multispecies coalescent (MSC) model and improved gene tree estimation methods can better accommodate gene tree heterogeneity caused by incomplete lineage sorting (ILS) and gene tree estimation error stemming from the short internal branches. However, the relative influence of these factors in species tree inference is not well understood. Using anchored hybrid enrichment, we generated a data set including 423 single-copy loci from 64 taxa representing 39 families to infer the species tree of the flowering plant order Malpighiales. This order alone includes nine of the top ten most unstable nodes in angiosperms, and the recalcitrant relationships along the backbone of the order have been hypothesized to arise from the rapid radiation during the Cretaceous. Here, we show that coalescent-based methods do not resolve the backbone of Malpighiales and concatenation methods yield inconsistent estimations, providing evidence that gene tree heterogeneity is high in this clade. Despite high levels of ILS and gene tree estimation error, our simulations demonstrate that these two factors alone are insufficient to explain the lack of resolution in this order. To explore this further, we examined triplet frequencies among empirical gene trees and discovered some of them deviated significantly from those attributed to ILS and estimation error, suggesting gene flow as an additional and previously unappreciated phenomenon promoting gene tree variation in Malpighiales. Finally, we applied a novel method to quantify the relative contribution of these three primary sources of gene tree heterogeneity and demonstrated that ILS, gene tree estimation error, and gene flow contributed to 15%, 52%, and 32% of the variation, respectively. 
Together, our results suggest that a perfect storm of factors likely influences this lack of resolution, and further indicate that recalcitrant phylogenetic relationships like the backbone of Malpighiales may be better represented as phylogenetic networks. Thus, reducing such groups solely to existing models that adhere strictly to bifurcating trees greatly oversimplifies reality and obscures our ability to discern the process of evolution more clearly.
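
The triplet-frequency reasoning in the abstract can be illustrated with a minimal sketch. Under the multispecies coalescent with ILS alone, the two minority topologies for any three taxa are expected to appear at equal frequency among gene trees, so a strong imbalance between them is commonly read as a hint of gene flow. The function below (a hypothetical name, with made-up counts) applies a one-degree-of-freedom chi-square test to that symmetry. It is not the authors' method, which, per the abstract, also accounts for gene tree estimation error.

```python
import math


def minor_triplet_asymmetry(counts):
    """Test whether the two minority triplet topologies are equally frequent.

    counts: three gene-tree counts, one per rooted topology of a taxon triplet.
    Returns (chi_square_statistic, p_value) with 1 degree of freedom.
    Assumption: ILS alone predicts equal counts for the two minority topologies.
    """
    if len(counts) != 3:
        raise ValueError("expected exactly three topology counts")
    # The most frequent topology is treated as the major one; keep the other two.
    n1, n2 = sorted(counts)[:2]
    total = n1 + n2
    if total == 0:
        return 0.0, 1.0  # no discordant gene trees, nothing to test
    # Chi-square against an expected 50:50 split of the minority counts.
    stat = (n1 - n2) ** 2 / total
    # Survival function of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


if __name__ == "__main__":
    # Illustrative counts only; these are not data from the paper.
    print(minor_triplet_asymmetry([200, 60, 60]))
    print(minor_triplet_asymmetry([200, 100, 50]))
```

A small p-value only flags a departure from the symmetric expectation. Separating gene flow from estimation error would still need simulation or a network-based model, as the abstract suggests.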

Updated: 2020-05-27