Imputation of 3 million SNPs in the Arabidopsis regional mapping population.
The Plant Journal (IF 6.2), Pub Date: 2020-02-11, DOI: 10.1111/tpj.14659
Bader Arouisse, Arthur Korte, Fred van Eeuwijk, Willem Kruijer

Natural variation has become a prime resource for identifying genetic variants that contribute to phenotypic variation. The regional mapping (RegMap) population is one of the most important populations for studying natural variation in Arabidopsis thaliana, and has been used in a large number of association studies and in studies on climatic adaptation. However, only 413 RegMap accessions have been completely sequenced, as part of the 1001 Genomes (1001G) Project, while the remaining 894 accessions have only been genotyped with the Affymetrix 250k chip. As a consequence, most association studies involving the RegMap are either restricted to the sequenced accessions, reducing power, or rely on a limited set of SNPs. Here we impute millions of SNPs to the 894 accessions that are exclusive to the RegMap, using the 1135 accessions of the 1001G Project as the reference panel. We assess imputation accuracy using a novel cross-validation scheme, which we show provides a more reliable measure of accuracy than existing methods. After filtering out low-accuracy SNPs, we obtain high-quality genotypic information for 2029 accessions (the 1135 reference accessions plus the 894 imputed ones) and 3 million markers. To illustrate the benefits of these imputed data, we repeated genome-wide association studies on five stress-related traits and identified novel candidate genes.

Updated: 2019-12-19