On the use of whole-genome sequence data for across-breed genomic prediction and fine-scale mapping of QTL
Genetics Selection Evolution (IF 4.1) Pub Date: 2021-02-26, DOI: 10.1186/s12711-021-00607-4
Theo Meuwissen 1, Irene van den Berg 2, Mike Goddard 2,3

Whole-genome sequence (WGS) data are increasingly available on large numbers of individuals in animal and plant breeding and in human genetics through second-generation resequencing technologies, 1000 genomes projects, and large-scale genotype imputation from lower marker densities. Here, we present a computationally fast implementation of a variable selection genomic prediction method that can handle WGS data on more than 35,000 individuals, test its accuracy for across-breed predictions, and assess its quantitative trait locus (QTL) mapping precision. The Monte Carlo Markov chain (MCMC) variable selection model (Bayes GC) simultaneously fits a genomic best linear unbiased prediction (GBLUP) term, i.e. a polygenic effect whose correlations are described by a genomic relationship matrix (G), and a Bayes C term, i.e. a set of single nucleotide polymorphisms (SNPs) with large effects selected by the model. Computational speed is improved by a Metropolis–Hastings sampling scheme that directs computations to the SNPs that are, a priori, most likely to be included in the model. Speed is also improved by running many relatively short MCMC chains. Memory requirements are reduced by storing the genotype matrix in binary form. The model was tested on a WGS dataset containing Holstein, Jersey and Australian Red cattle. The data contained 4,809,520 genotypes on 35,549 individuals together with their milk, fat and protein yields, and fat and protein percentage traits. The prediction accuracies of the Jersey individuals improved by 1.5% when using across-breed GBLUP compared to within-breed predictions. Using WGS instead of 600 k SNP-chip data yielded on average a 3% accuracy improvement for Australian Red cows. QTL were fine-mapped by locating the SNP with the highest posterior probability of being included in the model. Various QTL known from the literature were rediscovered, and a new SNP affecting milk production was discovered on chromosome 20 at 34.501126 Mb. Due to the high mapping precision, it was clear that many of the discovered QTL were the same across the five dairy traits. Across-breed Bayes GC genomic prediction improved prediction accuracies compared to GBLUP. The combination of across-breed WGS data and Bayesian genomic prediction proved remarkably effective for the fine-mapping of QTL.
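
The abstract states only that the genotype matrix is stored "in binary form" and does not give the encoding. As a rough illustration of how such storage can reduce memory, the sketch below packs genotype codes into 2 bits each, four individuals per byte. The layout, the code 3 for a missing genotype, and the function names `pack_genotypes` and `unpack_genotypes` are assumptions made for this example, not the authors' implementation.

```python
import numpy as np

# Bit offsets of the four genotype slots inside one byte (assumed layout).
_SHIFTS = np.array([0, 2, 4, 6], dtype=np.uint8).reshape(1, 4, 1)


def pack_genotypes(geno: np.ndarray) -> np.ndarray:
    """Pack an (individuals x SNPs) matrix of codes 0..3 into 2 bits per code.

    Codes 0/1/2 are allele counts; 3 is used here as a hypothetical
    missing-value code. Returns a uint8 array of shape
    (ceil(n_individuals / 4), n_snps).
    """
    geno = np.asarray(geno, dtype=np.uint8)
    if geno.ndim != 2:
        raise ValueError("expected a 2-D genotype matrix")
    if geno.max(initial=0) > 3:
        raise ValueError("genotype codes must be in 0..3")
    n_ind, n_snp = geno.shape
    pad = (-n_ind) % 4  # pad individuals up to a multiple of four
    padded = np.pad(geno, ((0, pad), (0, 0)), constant_values=0)
    groups = padded.reshape(-1, 4, n_snp)  # four individuals per group
    # Shift each slot to its bit position and OR the four slots together.
    return np.bitwise_or.reduce(groups << _SHIFTS, axis=1).astype(np.uint8)


def unpack_genotypes(packed: np.ndarray, n_ind: int) -> np.ndarray:
    """Recover the (n_ind x SNPs) matrix of 2-bit codes from the packed form."""
    codes = (packed[:, None, :] >> _SHIFTS) & 3
    return codes.reshape(-1, packed.shape[1])[:n_ind].astype(np.uint8)
```

Under these assumptions the packed matrix takes about a quarter of the memory of one byte per genotype, at the cost of unpacking a column (or computing directly on the bits) whenever a SNP is visited by the sampler.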

Updated: 2021-02-26