Japonica Array NEO with increased genome-wide coverage and abundant disease risk SNPs
bioRxiv - Genomics Pub Date : 2020-08-04 , DOI: 10.1101/2020.08.03.235226
Mika Sakurai-Yageta , Kazuki Kumada , Chinatsu Gocho , Satoshi Makino , Akira Uruno , Shu Tadaka , Ikuko N Motoike , Masae Kimura , Shin Ito , Akihito Otsuki , Akira Narita , Hisaaki Kudo , Yuichi Aoki , Inaho Danjoh , Jun Yasuda , Hiroshi Kawame , Naoko Minegishi , Seizo Koshiba , Nobuo Fuse , Gen Tamiya , Masayuki Yamamoto , Kengo Kinoshita

Background: Increasing the power of genome-wide association studies in diverse populations is important for understanding the genetic determinants of disease risks, and large-scale genotype data are being collected by genome cohort and biobank projects all over the world. In particular, ethnic-specific SNP arrays are becoming more important because universal SNP arrays have limitations in cost-effectiveness and throughput. As part of the Tohoku Medical Megabank Project, which integrates prospective genome cohorts into a biobank, we have been developing a series of Japonica Arrays for genotyping participants, based on reference panels constructed from whole-genome sequence data of the Japanese population.
Results: We designed a novel version of the SNP array for the Japanese population, called Japonica Array NEO, comprising a total of 666,883 SNPs, including tag SNPs of autosomes and the X chromosome with pseudoautosomal regions, SNPs of the Y chromosome and mitochondria, and known disease risk SNPs. Among them, 654,246 tag SNPs were selected from an expanded reference panel of 3,552 Japanese individuals using pairwise r² as the measure of linkage disequilibrium. Moreover, 28,298 SNPs were included for the evaluation of disease risk SNPs previously identified in the literature and databases; those present in the Japanese population were extracted using the reference panel. The imputation performance of Japonica Array NEO was assessed by genotyping 286 Japanese samples. We found that the imputation quality r² and INFO score in the minor allele frequency bin >2.5%-5% were >0.9 and >0.8, respectively, and that >12 million markers were imputed with an INFO score >0.8. After verification, Japonica Arrays were used to genotype cohort participants efficiently, in a workflow running from sample selection through quality assessment of the raw data; genotype data for approximately 130,000 of the >150,000 participants have already been obtained.
Conclusions: Japonica Array NEO is a promising tool for genotyping the Japanese population with genome-wide coverage, contributing to the development of genetic risk scores for this population and further identifying disease risk alleles among individuals of East Asian ancestry.
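
The abstract states that tag SNPs were selected using pairwise r² of linkage disequilibrium, but it does not describe the selection procedure. The sketch below illustrates the general idea only, under several assumptions that are not from the paper: genotypes are coded as 0/1/2 allele counts, r² is approximated by the squared Pearson correlation of those counts, the threshold of 0.8 is arbitrary, and the greedy covering strategy, the function names, and the SNP identifiers are all hypothetical.

```python
from statistics import mean


def pairwise_r2(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 coded).

    This genotype-based value is an approximation of LD r^2 (assumption).
    """
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treat as no LD
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def greedy_tag_snps(genotypes, threshold=0.8):
    """Pick tag SNPs greedily so every SNP has r^2 >= threshold with some tag.

    genotypes: dict mapping a SNP id to a list of 0/1/2 genotype codes.
    The threshold is an illustrative value, not one taken from the paper.
    """
    # For each SNP, the set of SNPs (itself included) it would tag.
    covers = {
        s: {
            t
            for t in genotypes
            if t == s or pairwise_r2(genotypes[s], genotypes[t]) >= threshold
        }
        for s in genotypes
    }
    untagged = set(genotypes)
    tags = []
    while untagged:
        # Choose the untagged SNP that covers the most still-untagged SNPs;
        # sorting makes tie-breaking deterministic.
        best = max(sorted(untagged), key=lambda s: len(covers[s] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags


# Toy data with hypothetical SNP ids (six samples each).
genotypes = {
    "snpA": [0, 1, 2, 0, 1, 2],
    "snpB": [0, 1, 2, 0, 1, 2],  # identical to snpA
    "snpC": [2, 1, 0, 2, 1, 0],  # mirror image of snpA, so r^2 is still 1
    "snpD": [0, 0, 1, 2, 2, 1],  # uncorrelated with snpA
}
print(greedy_tag_snps(genotypes))
```

In this toy input, snpA, snpB, and snpC are mutually redundant, so a single tag can stand in for all three, while snpD needs its own tag. A real array design would work from phased reference-panel haplotypes within genomic windows rather than an all-pairs comparison.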

Updated: 2020-08-05