Imputation accuracy to whole-genome sequence in Nellore cattle
Genetics Selection Evolution (IF 3.6), Pub Date: 2021-03-12, DOI: 10.1186/s12711-021-00622-5
Gerardo A Fernandes Júnior 1 , Roberto Carvalheiro 1, 2 , Henrique N de Oliveira 1, 2 , Mehdi Sargolzaei 3, 4 , Roy Costilla 5 , Ricardo V Ventura 6 , Larissa F S Fonseca 1 , Haroldo H R Neves 7 , Ben J Hayes 5 , Lucia G de Albuquerque 1, 2

A cost-effective strategy for exploiting the complete DNA sequence of animals in genetic evaluation is to sequence key ancestors of a population and then use imputation to infer the genotypes of markers that were not directly assayed in a target population genotyped with single nucleotide polymorphism (SNP) panels. The feasibility of this process depends on the accuracy of genotype imputation in that population, particularly for potential causal mutations, which may be at low frequency and located either within genes or within regulatory regions. The objective of the present study was to investigate the accuracy of imputation to the sequence level in a Nellore beef cattle population, including for variants in annotation classes that are more likely to be functional. Sequence information from 151 key Nellore sires was used to assess the accuracy of imputation from the bovine HD BeadChip SNP panel (~777 k) to whole-genome sequence. The sires were chosen to optimize imputation accuracy for a genotypic database of about 10,000 genotyped Nellore animals. Genotype imputation was performed using two computational approaches: FImpute3 and Minimac4 (after phasing with Eagle). Imputation accuracy was evaluated using a fivefold cross-validation scheme and measured as the squared correlation between observed and imputed genotypes, calculated per individual and per SNP. SNPs were classified into a range of annotation classes, and imputation accuracy within each class was also evaluated. High average imputation accuracies per animal were achieved using both FImpute3 (0.94) and Minimac4 (0.95). On average, common variants (minor allele frequency (MAF) > 0.03) were more accurately imputed by Minimac4, and low-frequency variants (MAF ≤ 0.03) were more accurately imputed by FImpute3. The Rsq imputation quality statistic reported by Minimac4 appears to be a good indicator of the empirical Minimac4 imputation accuracy.
Both programs provided high average SNP-wise imputation accuracy for all classes of biological annotation. Our results indicate that imputation to whole-genome sequence is feasible in Nellore beef cattle, since high imputation accuracies per individual are expected. SNP-wise imputation accuracy is software-dependent, especially for rare variants. The accuracy of imputation appears to be relatively independent of annotation class.
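
The accuracy measure described in the abstract, the squared correlation between observed and imputed genotypes calculated per individual and per SNP, can be illustrated with a minimal sketch. This is not the authors' pipeline: the 0/1/2 genotype coding, the toy matrices, and the helper names (`squared_correlation`, `minor_allele_frequency`, `maf_class`) are assumptions made for illustration only. The 0.03 MAF threshold is the one stated in the abstract.

```python
def squared_correlation(observed, imputed):
    """Squared Pearson correlation between two equal-length sequences.

    Returns None when either sequence has zero variance, because the
    correlation is undefined in that case (e.g. a SNP imputed as
    monomorphic).
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_i = sum(imputed) / n
    cov = sum((o - mean_o) * (i - mean_i) for o, i in zip(observed, imputed))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_i = sum((i - mean_i) ** 2 for i in imputed)
    if var_o == 0 or var_i == 0:
        return None
    return cov * cov / (var_o * var_i)


def minor_allele_frequency(genotypes):
    """MAF for genotypes coded as 0/1/2 copies of one allele (assumed coding)."""
    freq = sum(genotypes) / (2 * len(genotypes))
    return min(freq, 1 - freq)


def maf_class(maf, threshold=0.03):
    """Label a variant using the abstract's split: MAF > 0.03 is 'common'."""
    return "common" if maf > threshold else "low-frequency"


# Toy data, made up for illustration: rows are animals, columns are SNPs.
observed = [
    [0, 1, 2, 0],
    [1, 1, 0, 0],
    [2, 0, 1, 0],
    [0, 2, 1, 1],
]
imputed = [
    [0, 1, 2, 0],
    [1, 2, 0, 0],
    [2, 0, 1, 0],
    [0, 2, 1, 0],
]

# Accuracy per individual: correlate each animal's row across SNPs.
r2_per_animal = [squared_correlation(o, i) for o, i in zip(observed, imputed)]

# Accuracy per SNP: transpose so each column becomes one sequence.
obs_by_snp = list(zip(*observed))
imp_by_snp = list(zip(*imputed))
r2_per_snp = [squared_correlation(o, i) for o, i in zip(obs_by_snp, imp_by_snp)]

# MAF computed from the observed genotypes, then mapped to a class label.
maf_per_snp = [minor_allele_frequency(col) for col in obs_by_snp]
classes = [maf_class(m) for m in maf_per_snp]

print("r2 per animal:", r2_per_animal)
print("r2 per SNP:   ", r2_per_snp)
print("MAF class:    ", classes)
```

In this toy example the last SNP is imputed as monomorphic, so its squared correlation is undefined and reported as `None`. How such SNPs are handled when averaging per-SNP accuracies is an analysis choice that the abstract does not specify.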

Updated: 2021-03-12