Fast estimation of genetic correlation for biobank-scale data
American Journal of Human Genetics ( IF 9.8 ) Pub Date : 2021-12-02 , DOI: 10.1016/j.ajhg.2021.11.015
Yue Wu, Kathryn S Burch, Andrea Ganna, Päivi Pajukanta, Bogdan Pasaniuc, Sriram Sankararaman

Genetic correlation is an important parameter in efforts to understand the relationships among complex traits. Current methods that estimate genetic correlation from individual-level genotype data are difficult to scale to large datasets. Methods that analyze summary data are computationally efficient but tend to yield less precise estimates of genetic correlation. We propose SCORE (scalable genetic correlation estimator), a randomized method-of-moments estimator of genetic correlation that is both scalable and accurate. SCORE obtains more precise estimates of genetic correlation than summary-statistic methods that can be applied at scale: it achieves a 44% reduction in standard error relative to LD-score regression (LDSC) and a 20% reduction relative to high-definition likelihood (HDL), averaged over all simulations. The efficiency of SCORE enables computation of genetic correlations on the UK Biobank dataset, consisting of 300K individuals and 500K SNPs, in a few hours, which is orders of magnitude faster than methods that analyze individual-level data, such as GCTA. Across 780 pairs of traits in 291,273 unrelated white British individuals in the UK Biobank, SCORE identifies significant genetic correlation between 200 additional pairs of traits over LDSC (beyond the 245 pairs identified by both).
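
The abstract does not spell out the algorithm, so the following is only a minimal sketch of what a randomized method-of-moments estimator of genetic correlation can look like. It is not the authors' SCORE implementation. It assumes a standardized genotype matrix X (N individuals by M SNPs), centered phenotypes measured on the same individuals, a bivariate model in which cov(y1, y2) = ρ_g K + ρ_e I with K = X Xᵀ / M, and a Hutchinson-style random-probe estimate of tr(K²). All function names (`standardize`, `estimate_trace_K2`, `mom_genetic_correlation`, `simulate_traits`) are hypothetical, and covariates and standard errors are left out.

```python
import numpy as np

def standardize(genotypes):
    """Center and scale each SNP column to mean 0 and variance 1."""
    g = np.asarray(genotypes, dtype=float)
    g = g - g.mean(axis=0)
    return g / g.std(axis=0)

def estimate_trace_K2(X, n_probes=50, rng=None):
    """Hutchinson-style estimate of tr(K^2), K = X X^T / M, without forming K."""
    rng = np.random.default_rng() if rng is None else rng
    n, m = X.shape
    Z = rng.standard_normal((n, n_probes))   # random probe vectors
    KZ = X @ (X.T @ Z) / m                   # K Z via two thin products
    return np.sum(KZ ** 2) / n_probes        # average of z^T K^2 z

def mom_genetic_correlation(X, y1, y2, n_probes=50, seed=0):
    """Method-of-moments genetic correlation (illustrative assumption, not SCORE)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    tr_K = np.sum(X ** 2) / m
    tr_K2 = estimate_trace_K2(X, n_probes, rng)
    # Moment equations: E[u^T K v] = g * tr(K^2) + e * tr(K)
    #                   E[u^T v]   = g * tr(K)   + e * N
    A = np.array([[tr_K2, tr_K], [tr_K, float(n)]])

    def genetic_component(u, v):
        rhs = np.array([(X.T @ u) @ (X.T @ v) / m, u @ v])
        return np.linalg.solve(A, rhs)[0]

    g11 = genetic_component(y1, y1)   # genetic variance of trait 1
    g22 = genetic_component(y2, y2)   # genetic variance of trait 2
    g12 = genetic_component(y1, y2)   # genetic covariance
    # Note: negative variance estimates (possible in small samples) give NaN.
    return g12 / np.sqrt(g11 * g22)

def simulate_traits(n=2000, m=500, h2=0.5, rg=0.6, seed=1):
    """Toy data: binomial genotypes and two traits with correlated SNP effects."""
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.1, 0.5, size=m)
    X = standardize(rng.binomial(2, freqs, size=(n, m)))
    cov = (h2 / m) * np.array([[1.0, rg], [rg, 1.0]])
    beta = rng.multivariate_normal(np.zeros(2), cov, size=m)
    noise = rng.standard_normal((n, 2)) * np.sqrt(1.0 - h2)
    Y = X @ beta + noise
    return X, Y[:, 0], Y[:, 1]
```

Calling `mom_genetic_correlation(*simulate_traits())` returns an estimate that should land near the simulated value of 0.6 on this toy data. The point of the random probes is that each one costs two passes over X, so the N by N matrix K is never built, which is what makes this family of estimators plausible at biobank scale.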




Updated: 2022-01-06