A Darwinian Uncertainty Principle
Systematic Biology (IF 6.1), Pub Date: 2019-09-21, DOI: 10.1093/sysbio/syz054
Olivier Gascuel, Mike Steel

Abstract: Reconstructing ancestral characters and traits along a phylogenetic tree is central to evolutionary biology. It is the key to understanding morphological changes among species, inferring ancestral biochemical properties of life, or recovering migration routes in phylogeography. The goal is twofold: to reconstruct the character state at the tree root (e.g., the region of origin of some species) and to understand the process of state changes along the tree (e.g., species flow between countries). We deal here with discrete characters, which are “unique,” as opposed to sequence characters (nucleotides or amino acids), where we assume the same model for all the characters (or for large classes of characters with site-dependent models) and thus benefit from multiple information sources. In this framework, we use mathematics and simulations to demonstrate that, although each goal can be achieved with high accuracy individually, it is generally impossible to accurately estimate both the root state and the rates of state changes along the tree branches from the observed data at the tips of the tree. This is because the global rates of state changes along the branches that are optimal for the two estimation tasks have opposite trends, leading to a fundamental trade-off in accuracy. This inherent “Darwinian uncertainty principle” concerning the simultaneous estimation of “patterns” and “processes” governs ancestral reconstructions in biology. For certain tree shapes (typically speciation trees), the uncertainty of simultaneous estimation is reduced when more tips are present; for other tree shapes (e.g., coalescent trees used in population genetics), it is not.
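
The opposing trends described in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' analysis. It assumes a star tree (all tips attached directly to the root by branches of equal length), a two-state symmetric substitution model, and a simple majority-rule estimate of the root state; the function names are hypothetical. It computes two exact quantities: the probability that majority rule recovers the root state, and the probability that all tips show the same state, in which case the tip data carry essentially no evidence of change from which to estimate a rate.

```python
import math


def tip_flip_prob(rate, t=1.0):
    """Two-state symmetric model: probability that a tip's state differs
    from the root state after evolving for time t at the given rate."""
    return 0.5 * (1.0 - math.exp(-2.0 * rate * t))


def root_accuracy_majority(n, p):
    """Probability that the majority state among n independent tips equals
    the root state (star-tree assumption); ties are broken by a fair coin."""
    acc = 0.0
    for k in range(n + 1):  # k = number of tips that differ from the root
        prob = math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
        if 2 * k < n:
            acc += prob
        elif 2 * k == n:
            acc += 0.5 * prob
    return acc


def prob_all_tips_identical(n, p):
    """Probability that every tip shows the same state, so that no state
    change is visible in the data."""
    return (1.0 - p) ** n + p**n


if __name__ == "__main__":
    n_tips = 20  # illustrative value, not taken from the paper
    for rate in (0.01, 0.1, 0.5, 1.0, 3.0):
        p = tip_flip_prob(rate)
        print(
            f"rate={rate:<5} root accuracy={root_accuracy_majority(n_tips, p):.3f} "
            f"P(all tips identical)={prob_all_tips_identical(n_tips, p):.3f}"
        )
```

Under these assumptions, low rates make the root state easy to recover but leave the tips nearly uniform, while high rates produce variable tips but push root accuracy toward chance. This only mirrors the qualitative trade-off; the paper's results concern general tree shapes and estimators.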

Updated: 2019-09-21