The impact of genetic diversity statistics on model selection between coalescents
Computational Statistics & Data Analysis (IF 1.8), Pub Date: 2021-04-01, DOI: 10.1016/j.csda.2020.107055
Fabian Freund, Arno Siri-Jégousse

Abstract Modelling genetic diversity needs an underlying genealogy model. To choose a fitting model based on genetic data, one can perform model selection between classes of genealogical trees, e.g. Kingman’s coalescent with exponential growth or multiple merger coalescents. Such selection can be based on many different statistics measuring genetic diversity. A random forest based Approximate Bayesian Computation is used to disentangle the effects of different statistics on distinguishing between various classes of genealogy models. For the specific question of inferring whether genealogies feature multiple mergers, a new statistic, the minimal observable clade size, is introduced. When combined with classical site frequency based statistics, it reduces classification errors considerably.
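
As an illustration of the "classical site frequency based statistics" the abstract mentions, the sketch below computes an unfolded site frequency spectrum (SFS) from a small sample. It is not the authors' code: the function name `site_frequency_spectrum`, the 0/1 coding (0 = ancestral, 1 = derived, ancestral state assumed known), and the toy data are all assumptions made for this example. The abstract does not specify how the statistics are computed.

```python
from collections import Counter


def site_frequency_spectrum(haplotypes):
    """Return [xi_1, ..., xi_{n-1}] for n equal-length 0/1 sequences.

    xi_i is the number of sites at which exactly i of the n sequences
    carry the derived allele (coded 1). Hypothetical helper for illustration.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    num_sites = len(haplotypes[0])
    if any(len(h) != num_sites for h in haplotypes):
        raise ValueError("all sequences must have the same length")
    # Derived-allele count at each site, i.e. the column sums of the matrix.
    counts = Counter(sum(column) for column in zip(*haplotypes))
    # Sites with count 0 or n are not segregating and are left out.
    return [counts.get(i, 0) for i in range(1, n)]


# Toy sample: 4 sequences (rows), 4 sites (columns); made-up data.
sample = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
sfs = site_frequency_spectrum(sample)
print(sfs)  # [2, 1, 0]
```

In an Approximate Bayesian Computation setting, a vector like this (or summaries of it) would be computed for each simulated data set and supplied as features to the classifier that separates the genealogy model classes.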

Updated: 2021-04-01