The impact of disregarding family structure on genome-wide association analysis of complex diseases in cohorts with simple pedigrees.
Journal of Applied Genetics (IF 2.0), Pub Date: 2019-11-21, DOI: 10.1007/s13353-019-00526-7
Alireza Nazarian 1 , Konstantin G Arbeev 1 , Alexander M Kulminski 1

Generalized linear mixed models (GLMMs) are the standard framework for genome-wide association studies (GWAS) of complex diseases in family-based cohorts. Fitting GLMMs in very large cohorts, however, can be computationally demanding. Also, modified versions of GLMMs that use faster algorithms may underperform, for instance when a single nucleotide polymorphism (SNP) is correlated with fixed-effects covariates. We investigated the extent to which disregarding family structure may compromise GWAS in cohorts with simple pedigrees by contrasting logistic regression models (i.e., with no family structure) with three LMMs-based ones. Our analyses showed that the logistic regression models in general resulted in smaller P values compared with the LMMs-based models; however, the differences in P values were mostly minor. Disregarding family structure had little impact on determining disease-associated SNPs at the genome-wide level of significance (i.e., P < 5E-08), as the four P values resulting from the tested methods for any SNP were all below or all above 5E-08. Nevertheless, larger discrepancies were detected between logistic regression and LMMs-based models at the suggestive level of significance (i.e., 5E-08 ≤ P < 5E-06). The SNP effects estimated by the logistic regression models were not statistically different from those estimated by GLMMs that implemented Wald’s test. However, several SNP effects were significantly different from their counterparts in LMMs analyses. We suggest that fitting GLMMs with Wald’s test on a pre-selected subset of SNPs obtained from logistic regression models can balance the speed of analyses against the accuracy of parameter estimates.
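The threshold comparison described in the abstract can be illustrated with a minimal sketch. It classifies a P value using the two cut-offs quoted above (5E-08 and 5E-06) and checks whether the P values from several methods all fall on the same side of a threshold. The function names, the SNP labels, and the P values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
# Thresholds quoted in the abstract.
GENOME_WIDE = 5e-8   # genome-wide significance: P < 5E-08
SUGGESTIVE = 5e-6    # suggestive significance: 5E-08 <= P < 5E-06


def significance_level(p):
    """Label a P value by the significance band it falls into."""
    if p < GENOME_WIDE:
        return "genome-wide"
    if p < SUGGESTIVE:
        return "suggestive"
    return "none"


def concordant(p_values, threshold):
    """Return True if all P values are below, or all are not below, the threshold."""
    below = [p < threshold for p in p_values]
    return all(below) or not any(below)


# Hypothetical P values for two made-up SNPs, one value per tested method
# (logistic regression plus three mixed-model variants).
example = {
    "snp_hypothetical_1": [1e-9, 3e-9, 2e-9, 4e-9],
    "snp_hypothetical_2": [2e-6, 8e-6, 6e-6, 3e-6],
}

for snp, pvals in example.items():
    levels = [significance_level(p) for p in pvals]
    print(snp, levels,
          "agree at genome-wide:", concordant(pvals, GENOME_WIDE),
          "agree at suggestive:", concordant(pvals, SUGGESTIVE))
```

In this made-up data the second SNP agrees across methods at the genome-wide threshold but not at the suggestive one, which is the kind of discrepancy the abstract reports at the suggestive level.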

Updated: 2019-11-21