Universal probabilistic programming offers a powerful approach to statistical phylogenetics
Communications Biology (IF 5.2), Pub Date: 2021-02-24, DOI: 10.1038/s42003-021-01753-7
Fredrik Ronquist, Jan Kudlicka, Viktor Senderov, Johannes Borgström, Nicolas Lartillot, Daniel Lundén, Lawrence Murray, Thomas B. Schön, David Broman

Statistical phylogenetic analysis currently relies on complex, dedicated software packages, making it difficult for evolutionary biologists to explore new models and inference strategies. Recent years have seen more generic solutions based on probabilistic graphical models, but this formalism can only partly express phylogenetic problems. Here, we show that universal probabilistic programming languages (PPLs) solve the expressivity problem, while still supporting automated generation of efficient inference algorithms. To prove the latter point, we develop automated generation of sequential Monte Carlo (SMC) algorithms for PPL descriptions of arbitrary biological diversification (birth-death) models. SMC is a new inference strategy for these problems, supporting both parameter inference and efficient estimation of Bayes factors that are used in model testing. We take advantage of this in automatically generating SMC algorithms for several recent diversification models that have been difficult or impossible to tackle previously. Finally, applying these algorithms to 40 bird phylogenies, we show that models with slowing diversification, constant turnover and many small shifts generally explain the data best. Our work opens up several related problem domains to PPL approaches, and shows that few hurdles remain before these techniques can be effectively applied to the full range of phylogenetic models.
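To make the idea of describing a diversification model as a program more concrete, here is a minimal sketch in plain Python of a constant-rate birth-death process written as a forward simulation. It is not taken from the paper, it does not use a PPL, and it does not implement SMC; it uses ordinary Monte Carlo only. The function names, parameter names, and rate values are illustrative assumptions. The simulation estimates the probability that a single lineage leaves no surviving descendants after a given time and compares it with the standard closed-form result for this process.

```python
import math
import random


def goes_extinct(birth, death, duration, rng):
    """Simulate one lineage forward in time (hypothetical helper).

    Returns True if the lineage and all of its descendants die out
    before `duration` time units have elapsed.
    """
    # Each entry is the time still remaining for one live lineage.
    pending = [duration]
    while pending:
        remaining = pending.pop()
        # Waiting time to the next event on this lineage.
        wait = rng.expovariate(birth + death)
        if wait >= remaining:
            # This lineage reaches the present, so the clade survives.
            return False
        if rng.random() < birth / (birth + death):
            # Speciation: two daughter lineages continue from here.
            pending.append(remaining - wait)
            pending.append(remaining - wait)
        # Otherwise the lineage goes extinct and nothing is added.
    return True


def estimate_extinction(birth, death, duration, n_samples, seed=0):
    """Monte Carlo estimate of the extinction probability."""
    rng = random.Random(seed)
    extinct = sum(
        goes_extinct(birth, death, duration, rng) for _ in range(n_samples)
    )
    return extinct / n_samples


def extinction_probability(birth, death, duration):
    """Closed-form extinction probability for the same process."""
    if birth == death:
        return birth * duration / (1.0 + birth * duration)
    decay = math.exp(-(birth - death) * duration)
    return death * (1.0 - decay) / (birth - death * decay)


if __name__ == "__main__":
    # Illustrative rates only; not values from the paper.
    birth, death, duration = 0.5, 0.3, 5.0
    print("simulated:", estimate_extinction(birth, death, duration, 20000))
    print("analytic: ", extinction_probability(birth, death, duration))
```

The simulation is a few lines of ordinary control flow with unbounded branching, which is the kind of model structure the abstract says fixed graphical models can only partly express. A universal PPL would add conditioning on observed data on top of such a program and derive the inference algorithm automatically.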




Updated: 2021-02-24