Universal probabilistic programming offers a powerful approach to statistical phylogenetics
bioRxiv - Evolutionary Biology, Pub Date: 2020-12-10, DOI: 10.1101/2020.06.16.154443
Fredrik Ronquist, Jan Kudlicka, Viktor Senderov, Johannes Borgström, Nicolas Lartillot, Daniel Lundén, Lawrence Murray, Thomas B. Schön, David Broman

Statistical phylogenetic analysis currently relies on complex, dedicated software packages, making it difficult for evolutionary biologists to explore new models and inference strategies. Recent years have seen more generic solutions based on probabilistic graphical models, but this formalism can only partly express phylogenetic problems. Here we show that universal probabilistic programming languages (PPLs) solve the expressivity problem, while still supporting automated generation of efficient inference algorithms. To prove the latter point, we develop automated generation of sequential Monte Carlo (SMC) algorithms for PPL descriptions of arbitrary biological diversification (birth-death) models. SMC is a new inference strategy for these problems, supporting both parameter inference and efficient estimation of Bayes factors that are used in model testing. We take advantage of this in automatically generating SMC algorithms for several recent diversification models that have been difficult or impossible to tackle previously. Finally, applying these algorithms to 40 bird phylogenies, we show that models with slowing diversification, constant turnover and many small shifts generally explain the data best. Our work opens up several related problem domains to PPL approaches, and shows that few hurdles remain before these techniques can be effectively applied to the full range of phylogenetic models.
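
To make the expressivity point concrete, here is a minimal sketch of the kind of stochastic recursion that a birth-death model involves. It is written in plain Python rather than in a PPL. It is not the authors' code and it does not implement SMC. The function names (`survives`, `estimate_survival`, `analytic_survival`) and the parameter values are illustrative assumptions. The recursion draws an unbounded, data-dependent number of random variables, which a fixed graphical model cannot easily describe but a universal language can state directly.

```python
import math
import random


def survives(birth, death, time_left, rng):
    """Simulate one lineage forward and report whether it has at least one
    descendant alive at the present (time_left == 0)."""
    # Waiting time to the next event (speciation or extinction).
    wait = rng.expovariate(birth + death)
    if wait >= time_left:
        return True  # no event before the present: the lineage survives
    if rng.random() < death / (birth + death):
        return False  # the event is an extinction
    # The event is a speciation: recurse into both daughter lineages.
    # The daughters are independent, so short-circuiting `or` is valid.
    remaining = time_left - wait
    return (survives(birth, death, remaining, rng)
            or survives(birth, death, remaining, rng))


def estimate_survival(birth, death, age, n_samples, seed=0):
    """Monte Carlo estimate of the survival probability of one lineage."""
    rng = random.Random(seed)
    hits = sum(survives(birth, death, age, rng) for _ in range(n_samples))
    return hits / n_samples


def analytic_survival(birth, death, age):
    """Closed-form survival probability for constant rates, birth != death."""
    r = birth - death
    return r / (birth - death * math.exp(-r * age))


if __name__ == "__main__":
    # Hypothetical rates chosen only for illustration.
    print(estimate_survival(1.0, 0.5, 2.0, 20000))
    print(analytic_survival(1.0, 0.5, 2.0))
```

For constant rates the simulated estimate can be checked against the closed-form value, as the last two lines do. For more elaborate diversification models no such closed form may exist, which is where automatically generated inference of the kind described above becomes useful.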

Updated: 2020-12-11