GWAS significance thresholds for deep phenotyping studies can depend upon minor allele frequencies and sample size.
Molecular Psychiatry (IF 11.0), Pub Date: 2020-02-17, DOI: 10.1038/s41380-020-0670-3
Huma Asif 1, Ney Alliey-Rodriguez 1, Sarah Keedy 1, Carol A Tamminga 2, John A Sweeney 3, Godfrey Pearlson 4, Brett A Clementz 5, Matcheri S Keshavan 6, Peter Buckley 7, Chunyu Liu 8, Benjamin Neale 9, Elliot S Gershon 1, 10

An important issue affecting genome-wide association studies with deep phenotyping (multiple correlated phenotypes) is determining the suitable family-wise significance threshold. Straightforward family-wise correction (Bonferroni) of p < 0.05 for 4.3 million genotypes and 335 phenotypes would give a threshold of p < 3.46E−11. This would be too conservative because it assumes all tests are independent. The effective number of tests, both phenotypic and genotypic, must be adjusted for the correlations between them. Spectral decomposition of the phenotype matrix and LD-based correction of the number of tested SNPs are currently used to determine an effective number of tests. In this paper, we compare these calculated estimates with permutation-determined family-wise significance thresholds. Permutations are performed by shuffling individual IDs of the genotype vector for this dataset, to preserve correlation of phenotypes. Our results demonstrate that the permutation threshold is influenced by minor allele frequency (MAF) of the SNPs, and by the number of individuals tested. For the more common SNPs (MAF > 0.1), the permutation family-wise threshold was in close agreement with spectral decomposition methods. However, for less common SNPs (0.05 < MAF ≤ 0.1), the permutation threshold calculated over all SNPs was off by orders of magnitude. This applies to the number of individuals studied (here 777) but not to very much larger numbers. Based on these findings, we propose that the threshold to find a particular level of family-wise significance may need to be established using separate permutations of the actual data for several MAF bins.
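The two kinds of threshold contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`bonferroni_threshold`, `min_p_value`, `permutation_threshold`), the simulated data, and the use of a Fisher z normal approximation in place of a full association test are all assumptions made for illustration. With the rounded figure of 4.3 million genotypes, the Bonferroni arithmetic gives about 3.47E-11, close to the quoted 3.46E-11, which presumably reflects the unrounded SNP count. The permutation step shuffles the rows of the genotype matrix (the individual IDs) while leaving the phenotype matrix intact, so the correlation among phenotypes is preserved.

```python
import math
import numpy as np


def bonferroni_threshold(alpha, n_snps, n_phenotypes):
    """Family-wise threshold if every SNP x phenotype test were independent."""
    return alpha / (n_snps * n_phenotypes)


def standardize(x):
    """Center and scale each column (assumes no monomorphic/constant columns)."""
    return (x - x.mean(axis=0)) / x.std(axis=0)


def min_p_value(genotypes, phenotypes):
    """Smallest two-sided p-value over all SNP x phenotype correlations.

    Assumption: p-values come from the Fisher z normal approximation,
    a stand-in for whatever association test a real study would use.
    """
    n = genotypes.shape[0]
    r = standardize(genotypes).T @ standardize(phenotypes) / n
    r_max = min(np.abs(r).max(), 1.0 - 1e-12)  # keep atanh finite
    z = math.atanh(r_max) * math.sqrt(n - 3)
    return math.erfc(z / math.sqrt(2.0))


def permutation_threshold(genotypes, phenotypes, alpha=0.05, n_perm=200, seed=0):
    """Alpha-quantile of the minimum p-value across permuted datasets."""
    rng = np.random.default_rng(seed)
    n = genotypes.shape[0]
    min_ps = np.empty(n_perm)
    for i in range(n_perm):
        # Shuffle individuals in the genotype matrix only; phenotype rows
        # stay together, so phenotype-phenotype correlation is preserved.
        shuffled = genotypes[rng.permutation(n)]
        min_ps[i] = min_p_value(shuffled, phenotypes)
    return float(np.quantile(min_ps, alpha))


# Hypothetical toy data: 200 individuals, 50 SNPs, 5 correlated phenotypes.
rng = np.random.default_rng(1)
n, m, k = 200, 50, 5
maf = rng.uniform(0.1, 0.5, size=m)
genotypes = rng.binomial(2, maf, size=(n, m)).astype(float)
shared = rng.normal(size=(n, 1))  # common factor inducing phenotype correlation
phenotypes = shared + rng.normal(size=(n, k))

naive = bonferroni_threshold(0.05, m, k)
perm = permutation_threshold(genotypes, phenotypes)
print(f"Bonferroni: {naive:.2e}  permutation: {perm:.2e}")
```

To mimic the MAF-bin proposal, the same `permutation_threshold` call could be run separately on the genotype columns falling in each MAF bin (for example, 0.05 < MAF ≤ 0.1 and MAF > 0.1), giving one threshold per bin.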




Updated: 2020-02-17