Population-specific reference panels are crucial for genetic analyses: an example of the CREBRF locus in Native Hawaiians.
Human Molecular Genetics (IF 3.1), Pub Date: 2020-06-03, DOI: 10.1093/hmg/ddaa083
Meng Lin 1 , Christian Caberto 2 , Peggy Wan 1 , Yuqing Li 3 , Annette Lum-Jones 2 , Maarit Tiirikainen 2 , Loreall Pooler 1 , Brooke Nakamura 1 , Xin Sheng 1 , Jacqueline Porcel 1 , Unhee Lim 2 , Veronica Wendy Setiawan 1 , Loïc Le Marchand 2 , Lynne R Wilkens 2 , Christopher A Haiman 1 , Iona Cheng 3 , Charleston W K Chiang 1, 4

Statistical imputation applied to genome-wide array data is the most cost-effective approach to complete the catalog of genetic variation in a study population. However, imputed genotypes in underrepresented populations are less accurate because of ascertainment bias and a lack of representation among reference individuals, which further adds to the obstacles to studying these populations. Here we examined the consequences of this lack of representation by genotyping a functionally important, Polynesian-specific variant in the CREBRF gene, rs373863828, in a large number of self-reported Native Hawaiians (N = 3693). We found that the derived allele was significantly associated with several adiposity traits with large effects (e.g. ~1.28 kg/m² per allele in body mass index, the most significant association; P = 7.5 × 10⁻⁵), consistent with the original findings in Samoans. Because Polynesians are currently absent from publicly accessible reference sequences, neither rs373863828 nor its proxies could be tested through imputation using these existing resources. Moreover, the association signals at the entire CREBRF locus could not be captured by alternative approaches, such as admixture mapping. In contrast, highly accurate imputation can be achieved even with a small number (<200) of internally constructed Polynesian reference individuals; this would increase sample size and improve the statistical evidence of associations. Taken together, our results suggest the alarming possibility that lack of representation in reference panels could inhibit discovery of functionally important loci such as CREBRF. Yet such loci could be easily detected and prioritized with improved representation of diverse populations in sequencing studies.
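
As a rough illustration of what a "per allele" effect means under an additive genetic model, the sketch below regresses a phenotype on allele dosage (0, 1, or 2 copies of the derived allele) and reports the slope. It is not the authors' analysis pipeline: the data are simulated, no covariates are adjusted for, and the function names, allele frequency (0.15), baseline BMI (28.0), and residual spread (6.0) are assumptions chosen only for illustration. The sample size and the effect size fed into the simulation are borrowed from the abstract purely as example inputs.

```python
import random


def per_allele_effect(dosages, phenotypes):
    """Ordinary least-squares slope of phenotype on allele dosage.

    Under an additive model this slope is the expected change in the
    phenotype per extra copy of the allele (no covariates adjusted for).
    """
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    cov = sum((g - mean_g) * (y - mean_y) for g, y in zip(dosages, phenotypes))
    var = sum((g - mean_g) ** 2 for g in dosages)
    return cov / var


def simulate(n, freq, beta, sd, baseline=28.0, seed=0):
    """Simulate dosages and phenotypes (hypothetical helper, synthetic data).

    Each individual carries two independent allele draws with frequency
    `freq`; the phenotype is baseline + beta * dosage + Gaussian noise.
    """
    rng = random.Random(seed)
    dosages = [int(rng.random() < freq) + int(rng.random() < freq) for _ in range(n)]
    phenotypes = [baseline + beta * g + rng.gauss(0.0, sd) for g in dosages]
    return dosages, phenotypes


if __name__ == "__main__":
    # All inputs below are illustrative assumptions, not study data.
    dosages, bmi = simulate(n=3693, freq=0.15, beta=1.28, sd=6.0)
    print(f"estimated per-allele effect: {per_allele_effect(dosages, bmi):.2f} kg/m^2")
```

With a few thousand individuals the estimated slope should land near the simulated effect, though it will vary with the random seed and the assumed noise level. A real analysis would typically also adjust for covariates such as age, sex, and ancestry.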

Updated: 2020-08-04