Data Partitions and Complex Models in Bayesian Analysis: The Phylogeny of Gymnophthalmid Lizards
Systematic Biology (IF 6.5), Pub Date: 2004-06-01, DOI: 10.1080/10635150490445797
Todd A Castoe, Tiffany M Doan, Christopher L Parkinson

Phylogenetic studies incorporating multiple loci, and multiple genomes, are becoming increasingly common. Coincident with this trend in genetic sampling, model-based likelihood techniques including Bayesian phylogenetic methods continue to gain popularity. Few studies, however, have examined model fit and sensitivity to such potentially heterogeneous data partitions within combined data analyses using empirical data. Here we investigate the relative model fit and sensitivity of Bayesian phylogenetic methods when alternative site-specific partitions of among-site rate variation (with and without autocorrelated rates) are considered. Our primary goal in choosing a best-fit model was to employ the simplest model that was a good fit to the data while optimizing topology and/or Bayesian posterior probabilities. Thus, we were not interested in complex models that did not practically affect our interpretation of the topology under study. We applied these alternative models to a four-gene data set including one protein-coding nuclear gene (c-mos), one protein-coding mitochondrial gene (ND4), and two mitochondrial rRNA genes (12S and 16S) for the diverse yet poorly known lizard family Gymnophthalmidae. Our results suggest that the best-fit model partitioned among-site rate variation separately among the c-mos, ND4, and 12S + 16S gene regions. We found this model yielded identical topologies to those from analyses based on the GTR+I+G model, but significantly changed posterior probability estimates of clade support. This partitioned model also produced more precise (less variable) estimates of posterior probabilities across generations of long Bayesian runs, compared to runs employing a GTR+I+G model estimated for the combined data. We use this three-way gamma partitioning in Bayesian analyses to reconstruct a robust phylogenetic hypothesis for the relationships of genera within the lizard family Gymnophthalmidae. 
We then reevaluate the higher-level taxonomic arrangement of the Gymnophthalmidae. Based on our findings, we discuss the utility of nontraditional parameters for modeling among-site rate variation and the implications and future directions for complex model building and testing.

Updated: 2004-06-01