Optimal strategies for learning multi-ancestry polygenic scores vary across traits
bioRxiv - Genomics Pub Date: 2022-04-07, DOI: 10.1101/2021.01.15.426781
B.C.L. Lehmann, M. Mackintosh, G. McVean, C.C. Holmes

Polygenic scores (PGS) are individual-level measures that aggregate the genome-wide genetic predisposition to a given trait. Because PGS have predominantly been developed using European-ancestry samples, trait prediction using such European-ancestry-derived PGS is less accurate in individuals of non-European ancestry. Although there has been recent progress in combining multiple PGS trained on distinct populations, the problem of how to maximize performance given a multiple-ancestry cohort is largely unexplored. Here, we investigate the effect of sample size and ancestry composition on PGS performance for fifteen traits in UK Biobank. For some traits, PGS estimated using a relatively small African-ancestry training set outperformed, on an African-ancestry test set, PGS estimated using a much larger European-ancestry-only training set. We observe similar, but not identical, results when considering other minority-ancestry groups within UK Biobank. Our results emphasise the importance of targeted data collection from underrepresented groups to address existing disparities in PGS performance.

Updated: 2022-04-07