False and true positives in arthropod thermal adaptation candidate gene lists
Genetica (IF 1.5), Pub Date: 2021-05-07, DOI: 10.1007/s10709-021-00122-w
Maike Herrmann 1, 2 , Lev Y Yampolsky 3

Genome-wide studies are prone to false positives due to inherently low priors and statistical power. One approach to ameliorate this problem is to seek validation of reported candidate genes across independent studies: genes with repeatedly discovered effects are less likely to be false positives. Conversely, genes reported only as many times as expected by chance alone, while possibly representing novel discoveries, are also more likely to be false positives. We show that, across over 30 genome-wide studies that reported Drosophila and Daphnia genes with possible roles in thermal adaptation, the combined lists of candidate genes and orthologous groups are rapidly approaching the total number of genes and orthologous groups in the respective genomes. This is consistent with the expectation of a high frequency of false positives. The majority of these spurious candidates have been identified by one or a few studies, as expected by chance alone. In contrast, a noticeable minority of genes have been identified by numerous studies, with the probabilities of such discoveries occurring by chance alone being exceedingly small. For this subset of genes, different studies are in agreement with each other despite differences in the ecological settings, genomic tools and methodology, and reporting thresholds. We provide a reference set of presumed true positives among Drosophila candidate genes and orthologous groups involved in response to changes in temperature, suitable for cross-validation purposes. Despite this approach being prone to false negatives, this list of presumed true positives includes several hundred genes, consistent with the “omnigenic” concept of the genetic architecture of complex traits.
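The chance expectation that the abstract refers to can be illustrated with a small calculation. The sketch below is not the authors' method, which the abstract does not spell out. It assumes a simple null model in which each study draws its candidate list uniformly at random from the genome, so that the number of studies reporting a given gene follows a Poisson binomial distribution. The genome size and list sizes are hypothetical placeholders, not values from the paper.

```python
from math import prod


def report_count_pmf(list_sizes, genome_size):
    """PMF of how many studies report one given gene under the null model.

    Assumption: study i reports list_sizes[i] genes chosen uniformly at
    random, so a given gene is reported with probability n_i / genome_size.
    """
    pmf = [1.0]  # with zero studies, the gene is reported zero times
    for n in list_sizes:
        p = n / genome_size
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1 - p)   # this study misses the gene
            nxt[k + 1] += mass * p     # this study reports the gene
        pmf = nxt
    return pmf


def tail_probability(pmf, k):
    """Probability that a gene is reported by k or more studies by chance."""
    return sum(pmf[k:])


def expected_union_size(list_sizes, genome_size):
    """Expected number of distinct genes in the combined candidate list."""
    p_never = prod(1 - n / genome_size for n in list_sizes)
    return genome_size * (1 - p_never)


# Hypothetical inputs chosen only for illustration.
genome_size = 14000
list_sizes = [500] * 30

pmf = report_count_pmf(list_sizes, genome_size)
print("expected combined list size:", expected_union_size(list_sizes, genome_size))
for k in (1, 3, 5, 8):
    expected_genes = genome_size * tail_probability(pmf, k)
    print(f"expected genes reported by >= {k} studies by chance: {expected_genes:.3g}")
```

Under this null model, genes reported by far more studies than the tail probabilities allow are the ones that are hard to explain by chance, while a combined list that grows toward the genome size is what many overlapping random draws would produce.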




Updated: 2021-05-08