Phylogenetic Permulations: a statistically rigorous approach to measure confidence in associations between phenotypes and genetic elements in a phylogenetic context
bioRxiv - Evolutionary Biology. Pub Date: 2020-10-14, DOI: 10.1101/2020.10.14.338608
Elysia Saputra, Amanda Kowalczyk, Luisa Cusick, Nathan Clark, Maria Chikina

The wealth of high-quality genomes for numerous species has motivated many investigations into the genetic underpinnings of phenotypes. Comparative genomics methods approach this task by identifying convergent shifts at the genetic level that are associated with traits evolving convergently across independent lineages. However, these methods have complex statistical behaviors that are influenced by non-trivial and oftentimes unknown confounding factors. Consequently, using standard statistical analyses in interpreting the outputs of these methods leads to potentially inaccurate conclusions. Here, we introduce phylogenetic permulations, a novel statistical strategy that combines phylogenetic simulations and permutations to calculate accurate, unbiased p-values from phylogenetic methods. Permulations construct the null expectation for p-values from a given phylogenetic method by empirically generating null phenotypes. Subsequently, empirical p-values that capture the true statistical confidence given the correlation structure in the data are directly calculated based on the empirical null expectation. We examine the performance of permulation methods by analyzing both binary and continuous phenotypes, including marine, subterranean, and long-lived large-bodied mammal phenotypes. Our results reveal that permulations improve the statistical power of phylogenetic analyses and correctly calibrate statements of confidence in rejecting complex null distributions while maintaining or improving the enrichment of known functions related to the phenotype. We also find that permulations refine pathway enrichment analyses by correcting for non-independence in gene ranks. Our results demonstrate that permulations are a powerful tool for improving statistical confidence in the conclusions of phylogenetic analysis when the parametric null is unknown.
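
The abstract describes the strategy only at a high level, so the following is a minimal sketch of one way simulation and permutation could be combined for a continuous phenotype, not the authors' implementation. It assumes a Brownian-motion trait model, a rank-matching step that reassigns the observed values to tips in the order of a simulated trait, a Pearson correlation as the test statistic, and the common "+1" correction in the empirical p-value. The toy tree, the tip names, the data values, and all function names are hypothetical.

```python
import random

# Hypothetical toy tree: child -> (parent, branch_length). "root" has no parent.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 0.5),
    "C": ("n2", 0.7),
    "D": ("n2", 0.7),
    "n2": ("root", 0.8),
}
TIPS = ["A", "B", "C", "D"]


def simulate_brownian(tree, tips, rng):
    """Simulate a trait value at each tip by Brownian motion from the root."""
    values = {"root": 0.0}

    def value_of(node):
        if node not in values:
            parent, length = tree[node]
            # Variance of the change along a branch equals its length.
            values[node] = value_of(parent) + rng.gauss(0.0, length ** 0.5)
        return values[node]

    return [value_of(tip) for tip in tips]


def permulate(observed, tree, tips, rng):
    """Reassign the observed values to tips in the rank order of a simulated trait.

    The result is a permutation of the observed values (assumed scheme).
    """
    simulated = simulate_brownian(tree, tips, rng)
    order = sorted(range(len(tips)), key=lambda i: simulated[i])
    sorted_obs = sorted(observed)
    null = [0.0] * len(tips)
    for rank, tip_index in enumerate(order):
        null[tip_index] = sorted_obs[rank]
    return null


def pearson(x, y):
    """Pearson correlation of two equal-length lists with non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def empirical_p_value(observed_stat, null_stats):
    """Two-sided empirical p-value with a +1 correction (an assumed convention)."""
    extreme = sum(1 for s in null_stats if abs(s) >= abs(observed_stat))
    return (extreme + 1) / (len(null_stats) + 1)


rng = random.Random(0)
phenotype = [2.1, 2.4, 5.0, 5.6]      # hypothetical trait values for A, B, C, D
gene_rate = [0.10, 0.15, 0.40, 0.35]  # hypothetical per-tip evolutionary rates

observed_stat = pearson(phenotype, gene_rate)
null_stats = [
    pearson(permulate(phenotype, TREE, TIPS, rng), gene_rate)
    for _ in range(1000)
]
p_value = empirical_p_value(observed_stat, null_stats)
```

Because each null phenotype keeps the observed values and only changes which tip carries which value, the null statistics are drawn under a tree-informed correlation structure rather than under the assumption that species are independent.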

Updated: 2020-10-16