Identifying consistent allele frequency differences in studies of stratified populations.
Methods in Ecology and Evolution (IF 6.3), Pub Date: 2017-06-15, DOI: 10.1111/2041-210x.12810
R Axel W Wiberg 1, Oscar E Gaggiotti 2, Michael B Morrissey 1, Michael G Ritchie 1

  1. With the increasing application of pooled‐sequencing approaches to population genomics, robust methods are needed to accurately quantify allele frequency differences between populations. Identifying consistent differences across stratified populations can allow us to detect genomic regions that are under selection and that differ between populations with different histories or attributes. Currently popular statistical tests are implemented in widely available software tools, which makes them simple for researchers to apply. However, there are potential problems with the way such tests are used, which mean that underlying assumptions about the data are frequently violated.
  2. These problems are highlighted by simulating simple but realistic population genetic models of neutral evolution, and the performance of different tests is assessed. We present alternative tests (including Generalised Linear Models [GLMs] with quasibinomial error structure) that have attractive properties for the analysis of allele frequency differences, and we re‐analyse a published dataset.
  3. The simulations show that common statistical tests for consistent allele frequency differences perform poorly, with high false positive rates. Applying tests that do not confound heterogeneity and main effects significantly improves inference. Variation in sequencing coverage likely produces many false positives, and re‐scaling allele frequencies to counts out of a common value or an effective sample size reduces this effect.
  4. Many researchers are interested in finding allele frequencies that vary consistently across replicates, in order to identify loci underlying phenotypic responses to selection or natural variation in phenotypes. Popular methods that have been suggested for this task perform poorly in simulations. Overall, quasibinomial GLMs perform better; they also allow correction for multiple testing by standard procedures and are easily extended to other designs.
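
The two ideas named in points 3 and 4 (re‐scaling allele frequencies to counts out of a common value, and testing with a quasibinomial GLM) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `rescale_counts` and `quasibinomial_two_group`, the default common value of 100, and the example counts are all hypothetical. The sketch assumes the simplest possible design, a single two‐level treatment factor with replicate pools, for which the logit‐link fit has a closed form (the fitted frequency in each group is the pooled frequency) and the dispersion is the Pearson chi‐square divided by the residual degrees of freedom. The model structure used in the paper may differ.

```python
import math


def rescale_counts(alt, depth, common=100):
    """Re-express an allele count as a count out of `common` reads.

    The allele frequency alt/depth is kept; only the nominal sample
    size changes, so pools with very different coverage get equal weight.
    `common` could also be an effective sample size chosen by the user.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    scaled_alt = round(alt / depth * common)
    return scaled_alt, common - scaled_alt


def quasibinomial_two_group(counts, groups):
    """Quasibinomial logit GLM for one two-level factor (closed form).

    counts: list of (alt, ref) allele counts, one pair per pool.
    groups: list of 0/1 treatment labels, same length as counts.
    Returns (log_odds_difference, std_error, t_statistic, dispersion, df).
    """
    df = len(counts) - 2  # observations minus two fitted parameters
    if df <= 0:
        raise ValueError("need more than two pools to estimate dispersion")

    # Maximum-likelihood fit: pooled allele frequency within each group.
    freq, total = {}, {}
    for g in (0, 1):
        alt_sum = sum(a for (a, r), lab in zip(counts, groups) if lab == g)
        n_sum = sum(a + r for (a, r), lab in zip(counts, groups) if lab == g)
        if n_sum == 0 or alt_sum in (0, n_sum):
            raise ValueError("allele fixed or absent in a group; logit undefined")
        freq[g], total[g] = alt_sum / n_sum, n_sum

    # Pearson chi-square over pools, then dispersion = chi-square / df.
    chi2 = 0.0
    for (a, r), lab in zip(counts, groups):
        n, p = a + r, freq[lab]
        chi2 += (a - n * p) ** 2 / (n * p * (1 - p))
    dispersion = chi2 / df

    def logit(p):
        return math.log(p / (1 - p))

    beta = logit(freq[1]) - logit(freq[0])
    # Binomial variance of the log-odds difference, inflated by the dispersion.
    var = sum(1 / (total[g] * freq[g] * (1 - freq[g])) for g in (0, 1))
    se = math.sqrt(dispersion * var)
    t_stat = beta / se if se > 0 else math.copysign(math.inf, beta)
    return beta, se, t_stat, dispersion, df


# Hypothetical example: four pools (alt reads, total depth), two per treatment.
raw = [(5, 50), (40, 200), (32, 80), (150, 300)]
labels = [0, 0, 1, 1]
scaled = [rescale_counts(alt, depth) for alt, depth in raw]
result = quasibinomial_two_group(scaled, labels)
```

The returned t statistic would be compared against a t distribution with `df` degrees of freedom to obtain a p‐value, which can then be adjusted for multiple testing across loci by standard procedures. In practice a statistics package would normally be used instead of this hand‐rolled fit, particularly for designs with more than one factor.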


Updated: 2017-06-15