A generalized linear mixed model association tool for biobank-scale data, Nature Genetics



Nature Genetics (IF 31.7), Pub Date: 2021-11-04, DOI: 10.1038/s41588-021-00954-4
Longda Jiang¹,², Zhili Zheng¹, Hailing Fang²,³, Jian Yang¹,²,³


Compared with linear mixed model-based genome-wide association (GWA) methods, generalized linear mixed model (GLMM)-based methods have better statistical properties when applied to binary traits but are computationally much slower. In the present study, leveraging efficient sparse matrix-based algorithms, we developed a GLMM-based GWA tool, fastGWA-GLMM, that is severalfold to orders of magnitude faster than the state-of-the-art tools when applied to the UK Biobank (UKB) data and scalable to cohorts with millions of individuals. We show by simulation that the fastGWA-GLMM test statistics of both common and rare variants are well calibrated under the null, even for traits with extreme case–control ratios. We applied fastGWA-GLMM to the UKB data of 456,348 individuals, 11,842,647 variants and 2,989 binary traits (full summary statistics available at http://fastgwa.info/ukbimpbin), and identified 259 rare variants associated with 75 traits, demonstrating the use of imputed genotype data in a large cohort to discover rare variants for binary complex traits.
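The claim that test statistics are "well calibrated under the null" is commonly checked with a genomic inflation factor. The sketch below is not taken from the paper and does not use fastGWA-GLMM; it assumes 1-degree-of-freedom chi-square test statistics and uses simulated squared standard normals as a stand-in for null association results. The function name `genomic_inflation` is hypothetical.

```python
import random
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Return lambda_GC: observed median chi-square over the expected null median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


# Simulated null statistics (assumption: squared standard normals stand in
# for per-variant 1-df association test statistics with no true signal).
rng = random.Random(0)
null_stats = [rng.gauss(0.0, 1.0) ** 2 for _ in range(100_000)]

lambda_gc = genomic_inflation(null_stats)
print(f"lambda_GC = {lambda_gc:.3f}")
```

A value close to 1 suggests the statistics follow the expected null distribution; values well above 1 suggest inflation and values well below 1 suggest deflation.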


Updated: 2021-11-04
