Impact of linkage disequilibrium heterogeneity along the genome on genomic prediction and heritability estimation
Genetics Selection Evolution ( IF 3.6 ) Pub Date : 2022-06-27 , DOI: 10.1186/s12711-022-00737-3
Duanyang Ren 1 , Xiaodian Cai 1 , Qing Lin 1 , Haoqiang Ye 1 , Jinyan Teng 1 , Jiaqi Li 1 , Xiangdong Ding 2 , Zhe Zhang 1

Compared to medium-density single nucleotide polymorphism (SNP) data, high-density SNP data contain abundant genetic variants and provide more information for the genetic evaluation of livestock, but it has been shown that they do not confer any advantage for genomic prediction and heritability estimation. One possible reason is the uneven distribution of linkage disequilibrium (LD) along the genome, i.e., LD heterogeneity among regions. The aim of this study was to use genome-wide SNP data effectively for genomic prediction and heritability estimation by applying models that control LD heterogeneity among regions.

The LD-adjusted kinship (LDAK) and LD-stratified multicomponent (LDS) models were used to control LD heterogeneity among regions and were compared with the classical model, which has no such control. Simulated and real traits of 2000 dairy cattle with imputed high-density (770K) SNP data were used. Five types of phenotypes were simulated, controlled by very strongly, strongly, moderately, weakly and very weakly tagged causal variants, respectively. The performances of the models with the high-density and medium-density (50K) panels were compared to verify that the models that control LD heterogeneity among regions are more effective with high-density data.

Compared to the medium-density panel, the high-density panel did not improve, and in some cases decreased, the prediction accuracies and heritability estimates from the classical model for both simulated and real traits. Compared to the classical model, LDS improved the accuracy of genomic predictions and the unbiasedness of heritability estimates, regardless of the genetic architecture of the trait. LDAK was suitable only for traits that are mainly controlled by weakly tagged causal variants, and even for this type of trait it was less effective than LDS. Compared with the classical model, LDS improved prediction accuracy by about 13% for simulated phenotypes and by 0.3 to about 10.7% for real traits with the high-density panel, and by about 1% for simulated phenotypes and by −0.1 to about 6.9% for real traits with the medium-density panel.

Grouping SNPs based on regional LD to construct the LD-stratified multicomponent model can effectively eliminate the adverse effects of LD heterogeneity among regions, and greatly improve the efficiency of high-density SNP data for genomic prediction and heritability estimation.
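
The grouping step named in the conclusion can be illustrated with a rough sketch. This is not the authors' implementation: the function names, the SNP-count window used to define "regional LD", and the choice of four equal-sized groups are all assumptions made for illustration. It scores each SNP by its mean r² with nearby SNPs, splits the SNPs into groups by that score, and builds one genomic relationship matrix (GRM) per group.

```python
import numpy as np


def standardize(genotypes):
    """Center and scale each SNP column (0/1/2 allele counts) to mean 0, variance 1."""
    g = np.asarray(genotypes, dtype=float)
    sd = g.std(axis=0)
    if np.any(sd == 0):
        raise ValueError("remove monomorphic SNPs before standardizing")
    return (g - g.mean(axis=0)) / sd


def regional_ld_scores(z, window=25):
    """Mean r^2 between each SNP and its neighbours within +/- `window` SNPs.

    `window` is a hypothetical parameter; the paper's region definition may differ.
    """
    n, m = z.shape
    scores = np.zeros(m)
    for j in range(m):
        lo, hi = max(0, j - window), min(m, j + window + 1)
        r = z[:, lo:hi].T @ z[:, j] / n  # correlations of SNP j with its window
        n_neighbours = hi - lo - 1
        if n_neighbours > 0:
            # subtract the self term (r = 1) before averaging
            scores[j] = (np.sum(r ** 2) - 1.0) / n_neighbours
    return scores


def ld_stratified_grms(z, scores, n_groups=4):
    """Split SNPs into `n_groups` near-equal groups by LD score; one GRM per group."""
    order = np.argsort(scores, kind="stable")   # lowest-LD SNPs first
    groups = np.array_split(order, n_groups)    # arrays of SNP column indices
    grms = [z[:, idx] @ z[:, idx].T / len(idx) for idx in groups]
    return groups, grms
```

In a multicomponent model, each of these GRMs would receive its own variance component, so that SNPs in high-LD and low-LD regions are not forced to share one. Fitting those components (for example by REML) requires a mixed-model solver and is not shown here.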

Updated: 2022-06-28