Non-parametric Polygenic Risk Prediction via Partitioned GWAS Summary Statistics.
American Journal of Human Genetics (IF 8.1), Pub Date: 2020-05-28, DOI: 10.1016/j.ajhg.2020.05.004
Sung Chun 1, Maxim Imakaev 1, Daniel Hui 2, Nikolaos A Patsopoulos 2, Benjamin M Neale 3, Sekar Kathiresan 4, Nathan O Stitziel 5, Shamil R Sunyaev 1

In complex trait genetics, the ability to predict phenotype from genotype is the ultimate measure of our understanding of genetic architecture underlying the heritability of a trait. A complete understanding of the genetic basis of a trait should allow for predictive methods with accuracies approaching the trait’s heritability. The highly polygenic nature of quantitative traits and most common phenotypes has motivated the development of statistical strategies focused on combining myriad individually non-significant genetic effects. Now that predictive accuracies are improving, there is a growing interest in the practical utility of such methods for predicting risk of common diseases responsive to early therapeutic intervention. However, existing methods require individual-level genotypes or depend on accurately specifying the genetic architecture underlying each disease to be predicted. Here, we propose a polygenic risk prediction method that does not require explicitly modeling any underlying genetic architecture. We start with summary statistics in the form of SNP effect sizes from a large GWAS cohort. We then remove the correlation structure across summary statistics arising due to linkage disequilibrium and apply a piecewise linear interpolation on conditional mean effects. In both simulated and real datasets, this new non-parametric shrinkage (NPS) method can reliably allow for linkage disequilibrium in summary statistics of 5 million dense genome-wide markers and consistently improves prediction accuracy. We show that NPS improves the identification of groups at high risk for breast cancer, type 2 diabetes, inflammatory bowel disease, and coronary heart disease, all of which have available early intervention or prevention treatments.
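The two steps named in the abstract, removing the LD-induced correlation from summary statistics and then applying a piecewise linear interpolation, can be illustrated with a small toy sketch. This is not the authors' NPS implementation: the function names `decorrelate` and `piecewise_linear_shrink`, the eigenvector projection used for decorrelation, and the knot values are all assumptions chosen for illustration, and the abstract does not say how the interpolation points are estimated.

```python
import numpy as np


def decorrelate(beta_hat, ld, eps=1e-8):
    """Project marginal effect estimates onto the eigenvectors of an LD matrix.

    Assumption: `ld` is a symmetric SNP correlation matrix for one window.
    Dividing each projection by sqrt(eigenvalue) whitens it, so sampling
    noise with covariance proportional to `ld` becomes uncorrelated.
    """
    eigvals, eigvecs = np.linalg.eigh(ld)
    keep = eigvals > eps  # drop near-null directions to avoid dividing by ~0
    eigvals, eigvecs = eigvals[keep], eigvecs[:, keep]
    eta = (eigvecs.T @ beta_hat) / np.sqrt(eigvals)
    return eta, eigvals, eigvecs


def piecewise_linear_shrink(eta, knots, values):
    """Map |eta| through a piecewise linear curve and restore the sign.

    Assumption: `knots` (increasing) and `values` are hypothetical points on
    a conditional-mean-effect curve; inputs beyond the last knot are clamped.
    """
    eta = np.asarray(eta, dtype=float)
    return np.sign(eta) * np.interp(np.abs(eta), knots, values)


if __name__ == "__main__":
    ld = np.array([[1.0, 0.5], [0.5, 1.0]])  # toy 2-SNP LD matrix
    beta_hat = np.array([0.2, 0.1])          # toy marginal effect sizes
    eta, eigvals, eigvecs = decorrelate(beta_hat, ld)
    shrunk = piecewise_linear_shrink(eta, [0.0, 1.0, 2.0], [0.0, 0.5, 2.0])
    print(eta, shrunk)
```

In this sketch the shrinkage curve is supplied by hand. A data-driven version would have to estimate those points from data, which is the part of the method the abstract only summarizes.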


Updated: 2020-07-02