Using Parsimony-Guided Tree Proposals to Accelerate Convergence in Bayesian Phylogenetic Inference
Systematic Biology (IF 6.5), Pub Date: 2020-01-27, DOI: 10.1093/sysbio/syaa002
Chi Zhang 1,2, John P Huelsenbeck 3, Fredrik Ronquist 4

Abstract Sampling across tree space is one of the major challenges in Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) algorithms. Standard MCMC tree moves consider small random perturbations of the topology, and select from candidate trees at random or based on the distance between the old and new topologies. MCMC algorithms using such moves tend to get trapped in tree space, making them slow in finding the globally most probable trees (known as “convergence”) and in estimating the correct proportions of the different types of them (known as “mixing”). Here, we introduce a new class of moves, which propose trees based on their parsimony scores. The proposal distribution derived from the parsimony scores is a quickly computable albeit rough approximation of the conditional posterior distribution over candidate trees. We demonstrate with simulations that parsimony-guided moves correctly sample the uniform distribution of topologies from the prior. We then evaluate their performance against standard moves using six challenging empirical data sets, for which we were able to obtain accurate reference estimates of the posterior using long MCMC runs, a mix of topology proposals, and Metropolis coupling. On these data sets, ranging in size from 357 to 934 taxa and from 1740 to 5681 sites, we find that single chains using parsimony-guided moves usually converge an order of magnitude faster than chains using standard moves. They also exhibit better mixing, that is, they cover the most probable trees more quickly. Our results show that tree moves based on quick and dirty estimates of the posterior probability can significantly outperform standard moves. Future research will have to show to what extent the performance of such moves can be improved further by finding better ways of approximating the posterior probability, taking the trade-off between accuracy and speed into account. [Bayesian phylogenetic inference; MCMC; parsimony; tree proposal.]
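The idea of proposing trees according to their parsimony scores can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state how scores are mapped to proposal probabilities, so the exponential weighting, the `warp` parameter, and all function names below are assumptions made for the example. Candidate trees with lower parsimony scores (fewer implied changes) receive higher proposal probability, and the Hastings ratio corrects for the resulting asymmetry so that the chain still targets the posterior.

```python
import math
import random


def proposal_probs(parsimony_scores, warp=0.1):
    """Map parsimony scores (lower is better) to proposal probabilities.

    Assumed form (not taken from the abstract):
    p_i is proportional to exp(-warp * score_i).
    """
    best = min(parsimony_scores)
    # Subtract the best score before exponentiating for numerical stability.
    weights = [math.exp(-warp * (s - best)) for s in parsimony_scores]
    total = sum(weights)
    return [w / total for w in weights]


def pick_candidate(parsimony_scores, warp=0.1, rng=random):
    """Draw one candidate index; return it with its proposal probability."""
    probs = proposal_probs(parsimony_scores, warp)
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, probs[index]


def log_hastings_ratio(forward_prob, reverse_prob):
    """log of q(old | new) / q(new | old) for a Metropolis-Hastings step."""
    return math.log(reverse_prob) - math.log(forward_prob)


# Hypothetical parsimony scores for three candidate trees.
scores = [10, 12, 15]
probs = proposal_probs(scores)
chosen, forward = pick_candidate(scores, rng=random.Random(1))
```

Setting `warp=0` in this sketch recovers a uniform choice among the candidates, which corresponds to the "select at random" behavior the abstract attributes to standard moves; larger values concentrate the proposal on the most parsimonious candidates.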

Updated: 2020-01-27