Modeling Pulsed Evolution and Time-Independent Variation Improves the Confidence Level of Ancestral and Hidden State Predictions
Systematic Biology (IF 6.1), Pub Date: 2022-02-23, DOI: 10.1093/sysbio/syac016
Yingnan Gao, Martin Wu

Ancestral state reconstruction is not only a fundamental tool for studying trait evolution, but also very useful for predicting the unknown trait values (hidden states) of extant species. A well-known problem in ancestral and hidden state predictions is that the uncertainty associated with predictions can be so large that the predictions themselves are of little use. Therefore, for meaningful interpretation of predicted traits and hypothesis testing, it is prudent to accurately assess the uncertainty of the predictions. The commonly used constant-rate Brownian motion (BM) model fails to capture the complexity of the tempo and mode of trait evolution in nature, making predictions under the BM model vulnerable to lack-of-fit errors from model misspecification. Using empirical data (mammalian body size and bacterial genome size), we show that the distribution of residual Z-scores under the BM model is neither homoscedastic nor normal as expected. Consequently, the 95% confidence intervals of predicted traits are so unreliable that the actual coverage probability ranges from 33% (strongly permissive) to 100% (strongly conservative). Alternative methods such as BayesTraits and StableTraits, which allow variable rates of evolution, improve the predictions but are computationally expensive. Here, we develop Reconstructing Ancestral State under Pulsed Evolution in R by Gaussian Decomposition (RasperGade), a method of ancestral and hidden state prediction that uses the Lévy process to explicitly model gradual evolution, pulsed evolution, and time-independent variation. Using the same empirical data, we show that RasperGade outperforms both BayesTraits and StableTraits in providing reliable confidence estimates and is orders of magnitude faster. Our results suggest that, when predicting the ancestral and hidden states of continuous traits, the rate variation should always be assessed and the quality of confidence estimates should always be examined.
[Bacterial genomic traits; model misspecification; trait evolution.]
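
The coverage check described in the abstract can be illustrated with a small toy simulation. The sketch below is not RasperGade and does not reproduce the paper's analysis. It assumes a single branch of length `t` descending from an ancestor with known state 0, and the function names (`simulate_change`, `z_scores`, `coverage`) and all parameter values are hypothetical. It draws trait changes as the sum of a Brownian motion term, compound-Poisson normal jumps (a simple stand-in for pulsed evolution), and a time-independent term, then measures how often a nominal 95% interval built under constant-rate BM alone contains the simulated value.

```python
import math
import random

Z95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def poisson_count(rate, t, rng):
    """Number of events of a Poisson process with the given rate within time t."""
    if rate <= 0 or t <= 0:
        return 0
    n, elapsed = 0, rng.expovariate(rate)
    while elapsed <= t:
        n += 1
        elapsed += rng.expovariate(rate)
    return n


def simulate_change(t, rng, sigma2=1.0, jump_rate=0.0, jump_var=0.0, eps_var=0.0):
    """Trait change along a branch of length t (toy model, hypothetical parameters).

    sigma2    : BM rate (gradual evolution)
    jump_rate : expected number of jumps per unit time (pulsed evolution)
    jump_var  : variance of each normally distributed jump
    eps_var   : variance of the time-independent variation
    """
    change = rng.gauss(0.0, math.sqrt(sigma2 * t))
    for _ in range(poisson_count(jump_rate, t, rng)):
        change += rng.gauss(0.0, math.sqrt(jump_var))
    change += rng.gauss(0.0, math.sqrt(eps_var))
    return change


def z_scores(observed, predicted, pred_sd):
    """Residual Z-scores: (observed - predicted) / predicted standard deviation."""
    return [(o - p) / s for o, p, s in zip(observed, predicted, pred_sd)]


def coverage(z, crit=Z95):
    """Fraction of Z-scores that fall inside the nominal confidence interval."""
    return sum(abs(v) <= crit for v in z) / len(z)


def demo(n=20000, t=0.1, seed=1):
    """Compare coverage of BM-based 95% intervals under two data-generating models."""
    rng = random.Random(seed)
    bm_sd = [math.sqrt(1.0 * t)] * n  # prediction SD under constant-rate BM
    zero = [0.0] * n                  # BM prediction equals the ancestral state

    # Case 1: data really follow BM, so the intervals should be well calibrated.
    bm_data = [simulate_change(t, rng) for _ in range(n)]
    # Case 2: data also carry time-independent variation that BM ignores.
    tiv_data = [simulate_change(t, rng, eps_var=1.0) for _ in range(n)]

    return (coverage(z_scores(bm_data, zero, bm_sd)),
            coverage(z_scores(tiv_data, zero, bm_sd)))


if __name__ == "__main__":
    well_specified, misspecified = demo()
    print(f"coverage, BM data:                  {well_specified:.3f}")
    print(f"coverage, BM + time-independent:    {misspecified:.3f}")
```

In this toy setting, intervals computed under BM are close to nominal when the data are pure BM, but become permissive on a short branch once time-independent variation is present, because the BM prediction variance shrinks with branch length while the unmodeled variance does not. Setting `jump_rate` and `jump_var` to positive values gives a rough way to explore the effect of pulsed change as well.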

Updated: 2022-02-23