19 Dubious Ways to Compute the Marginal Likelihood of a Phylogenetic Tree Topology
Systematic Biology (IF 6.1), Pub Date: 2019-08-28, DOI: 10.1093/sysbio/syz046
Mathieu Fourment 1, Andrew F Magee 2, Chris Whidden 3, Arman Bilge 3, Frederick A Matsen 3, Vladimir N Minin 4

The marginal likelihood of a model is a key quantity for assessing the evidence provided by the data in support of a model. The marginal likelihood is the normalizing constant for the posterior density, obtained by integrating the product of the likelihood and the prior with respect to model parameters. Thus, the computational burden of computing the marginal likelihood scales with the dimension of the parameter space. In phylogenetics, where we work with tree topologies that are high-dimensional models, standard approaches to computing marginal likelihoods are very slow. Here we study methods to quickly compute the marginal likelihood of a single fixed tree topology. We benchmark the speed and accuracy of 19 different methods to compute the marginal likelihood of phylogenetic topologies on a suite of real datasets under the JC69 model. These methods include several new ones that we develop explicitly to solve this problem, as well as existing algorithms that we apply to phylogenetic models for the first time. Altogether, our results show that the accuracy of these methods varies widely, and that accuracy does not necessarily correlate with computational burden. Our newly developed methods are orders of magnitude faster than standard approaches, and in some cases, their accuracy rivals the best established estimators.
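To make the definition in the abstract concrete, the sketch below computes a marginal likelihood for the smallest possible case: two aligned sequences joined by a single branch of length t under JC69, so the integral of likelihood times prior is one-dimensional. It is an illustration only and is not taken from the paper. The exponential prior on t, its rate, the site counts, and all function names are assumptions chosen for the example. The integral is evaluated twice, once by midpoint quadrature on a grid and once by averaging the likelihood over draws from the prior, to show that two estimators of the same quantity can differ in cost and accuracy.

```python
import math
import random


def jc69_log_likelihood(t, n_sites, n_diff):
    """Log-likelihood of two aligned sequences at distance t under JC69.

    n_sites is the alignment length and n_diff the number of differing sites.
    """
    e = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * e  # probability a site is unchanged
    p_diff = 0.25 - 0.25 * e  # probability of one specific different base
    log_lik = (n_sites - n_diff) * math.log(0.25 * p_same)
    if n_diff > 0:
        if p_diff <= 0.0:  # t == 0 cannot explain observed differences
            return -math.inf
        log_lik += n_diff * math.log(0.25 * p_diff)
    return log_lik


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_marginal_quadrature(n_sites, n_diff, rate, t_max=10.0, n_grid=20000):
    """Midpoint-rule estimate of log of the integral of L(t) * prior(t) dt.

    Assumes an Exponential(rate) prior on t, truncated at t_max.
    """
    h = t_max / n_grid
    terms = []
    for i in range(n_grid):
        t = (i + 0.5) * h
        log_prior = math.log(rate) - rate * t
        terms.append(jc69_log_likelihood(t, n_sites, n_diff) + log_prior)
    return log_sum_exp(terms) + math.log(h)


def log_marginal_prior_sampling(n_sites, n_diff, rate, n_samples=50000, seed=1):
    """Monte Carlo estimate: average the likelihood over draws from the prior."""
    rng = random.Random(seed)
    logs = [
        jc69_log_likelihood(rng.expovariate(rate), n_sites, n_diff)
        for _ in range(n_samples)
    ]
    return log_sum_exp(logs) - math.log(n_samples)


if __name__ == "__main__":
    # Hypothetical data: 100 sites, 10 of which differ; prior mean 0.1.
    print("quadrature     :", log_marginal_quadrature(100, 10, rate=10.0))
    print("prior sampling :", log_marginal_prior_sampling(100, 10, rate=10.0))
```

In this toy case the prior and the posterior overlap well, so sampling from the prior works. On a real tree topology the integral runs over every branch length at once, a grid is no longer feasible, and prior draws rarely land where the likelihood is large, which is the kind of difficulty the benchmarked methods have to address.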

Updated: 2019-08-28