An evaluation of the interpretability and predictive performance of the BayesR model for genomic prediction
bioRxiv - Genomics Pub Date : 2020-10-23 , DOI: 10.1101/2020.10.23.351700
Fanny Mollandin , Andrea Rau , Pascal Croiseau

Technological advances and decreasing costs have led to increasingly dense genotyping data, making it feasible to identify potential causal markers. Custom genotyping chips, which combine medium-density genotypes with a custom genotype panel, can capitalize on these candidates to potentially yield improved accuracy and interpretability in genomic prediction. A particularly promising model to this end is BayesR, which divides markers into four effect size classes. BayesR has been shown to yield accurate predictions and to hold promise for quantitative trait loci (QTL) mapping in real data applications, but an extensive benchmarking in simulated data is currently lacking. Based on a set of real genotypes, we generated simulated data under a variety of genetic architectures and phenotype heritabilities, and we evaluated the impact of excluding or including causal markers among the genotypes. We define several statistical criteria for QTL mapping, including several based on sliding windows to account for linkage disequilibrium. We compare and contrast these statistics and their ability to accurately prioritize known causal markers. Overall, we confirm the strong predictive performance of BayesR in moderately to highly heritable traits, particularly for 50k custom data. In cases of low heritability or weak linkage disequilibrium with the causal marker in 50k genotypes, QTL mapping is a challenge, regardless of the criterion used. BayesR is a promising approach to simultaneously obtain accurate predictions and interpretable classifications of SNPs into effect size classes. We illustrate the performance of BayesR in a variety of simulation scenarios and compare the advantages and limitations of each.
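
The abstract does not spell out the sliding-window criteria, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical per-SNP score (here, made-up posterior probabilities of a non-zero effect) and sums it over runs of neighbouring markers, so that signal spread across SNPs in linkage disequilibrium is pooled. The function name, the window size, and the values in `pip` are all illustrative assumptions.

```python
import numpy as np

def sliding_window_sum(values, window):
    """Sum `values` over every run of `window` consecutive markers."""
    values = np.asarray(values, dtype=float)
    if window < 1 or window > values.size:
        raise ValueError("window must be between 1 and the number of markers")
    # Convolving with a vector of ones gives the sum over each full window.
    return np.convolve(values, np.ones(window), mode="valid")

# Hypothetical per-SNP scores (illustrative values, not from the paper).
pip = np.array([0.01, 0.02, 0.30, 0.35, 0.25, 0.02, 0.01])

# One score per window of 3 adjacent markers.
window_scores = sliding_window_sum(pip, window=3)

# Index of the first marker in the highest-scoring window.
best_start = int(np.argmax(window_scores))
```

In this toy input no single marker stands out strongly, but the window covering the three moderately scored neighbours does, which is the kind of situation a window-based criterion is meant to capture.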

Updated: 2020-10-27