Relaxed Random Walks at Scale
Systematic Biology (IF 6.1), Pub Date: 2020-07-20, DOI: 10.1093/sysbio/syaa056
Alexander A Fisher 1, Xiang Ji 2, Zhenyu Zhang 3, Philippe Lemey 4, Marc A Suchard 1, 3, 5

Relaxed random walk (RRW) models of trait evolution introduce branch-specific rate multipliers to modulate the variance of a standard Brownian diffusion process along a phylogeny and more accurately model overdispersed biological data. Increased taxonomic sampling challenges inference under RRWs as the number of unknown parameters grows with the number of taxa. To solve this problem, we present a scalable method to efficiently fit RRWs and infer this branch-specific variation in a Bayesian framework. We develop a Hamiltonian Monte Carlo (HMC) sampler to approximate the high-dimensional, correlated posterior that exploits a closed-form evaluation of the gradient of the trait data log-likelihood with respect to all branch-rate multipliers simultaneously. Our gradient calculation achieves computational complexity that scales only linearly with the number of taxa under study. We compare the efficiency of our HMC sampler to the previously standard univariable Metropolis-Hastings approach while studying the spatial emergence of the West Nile virus in North America in the early 2000s. Our method achieves at least a 6-fold speed-increase over the univariable approach. Additionally, we demonstrate the scalability of our method by applying the RRW to study the correlation between five mammalian life history traits in a phylogenetic tree with 3650 tips.
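
To make the role of the branch-rate multipliers concrete, here is a minimal sketch of a univariate relaxed random walk on a toy tree. It is not the authors' implementation: the four-tip tree, the node names, the functions `tip_covariance` and `simulate`, and the convention that branch b contributes variance sigma2 * rate_b * length_b are all assumptions made for illustration. It does not implement the HMC sampler or the linear-time gradient described in the abstract.

```python
import random

# Hypothetical 4-tip tree, written as child -> (parent, branch_length).
# Names and lengths are made up; parents are listed before their children.
TREE = {
    "n1": ("root", 1.0),
    "n2": ("root", 0.5),
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "C": ("n2", 1.5),
    "D": ("n2", 1.5),
}
TIPS = ["A", "B", "C", "D"]


def path_to_root(node):
    """Return the nodes from `node` up to (not including) the root.

    Each non-root node also labels the branch directly above it.
    """
    path = []
    while node in TREE:
        path.append(node)
        node = TREE[node][0]
    return path


def tip_covariance(rates, sigma2=1.0):
    """Covariance between tip traits under the assumed RRW convention.

    Two tips covary through the branches they share; each shared branch b
    contributes sigma2 * rates[b] * length_b.
    """
    cov = {}
    for i in TIPS:
        for j in TIPS:
            shared = set(path_to_root(i)) & set(path_to_root(j))
            cov[i, j] = sum(sigma2 * rates[b] * TREE[b][1] for b in shared)
    return cov


def simulate(rates, sigma2=1.0, root_value=0.0, seed=0):
    """Draw one set of tip traits by diffusing from the root down the tree."""
    rng = random.Random(seed)
    values = {"root": root_value}
    for child, (parent, length) in TREE.items():
        sd = (sigma2 * rates[child] * length) ** 0.5
        values[child] = values[parent] + rng.gauss(0.0, sd)
    return {tip: values[tip] for tip in TIPS}


# Standard Brownian diffusion: every multiplier equals 1.
strict = {branch: 1.0 for branch in TREE}
# Relaxed version: the branch leading to tip A evolves 4 times faster.
relaxed = dict(strict, A=4.0)

cov_strict = tip_covariance(strict)
cov_relaxed = tip_covariance(relaxed)
sample = simulate(relaxed)
```

Under these assumptions, raising the multiplier on the branch above tip A inflates only the variance of A (from 2.0 to 5.0 here) and leaves its covariance with B unchanged, since the two tips share only the branch above `n1`. Because there is one multiplier per branch, the number of unknowns grows with the number of taxa, which is the scaling problem the abstract refers to.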


