A population-based approach for gene prioritization in understanding complex traits.
Human Genetics ( IF 5.3 ) Pub Date : 2020-03-30 , DOI: 10.1007/s00439-020-02152-4
Massimo Mezzavilla 1 , Massimiliano Cocca 1 , Francesca Guidolin 2 , Paolo Gasparini 1, 3

Gene prioritization is the process of determining which variants and genes identified in genetic analyses are likely to cause a disease or a variation in a phenotype. For many genes, neither in vitro nor in vivo testing is available, so assessing their pathogenic role can be challenging and can lead to false-positive or false-negative results. In this paper, we propose a new gene prioritization score based on the population of interest. We introduce the concept of a singleton-cohort variant (SC variant): a variant with an allele count of one in the cohort under study. The difference between the normalized count of SC variants in the coding region and the normalized count of SC variants in the non-coding region should indicate the level of constraint on that gene in a specific population. The score is negative when constraint allows SC variants only in the non-coding region; conversely, it is positive when there is no such constraint. A complementary score is the sum of the normalized SC variant counts in the coding and non-coding regions, which could be used as a proxy for positive selection or strong purifying selection in a specific population. Our methodology showed a high level of constraint for genes such as USP34 in all subpopulations tested (1000 Genomes dataset). In contrast, some genes showed a strongly negative score only in specific populations, e.g., MYT1L in Europeans, UBR5 in East Asians, and FBXO11 in Africans.
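
The two scores described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the counts are normalized, so the sketch assumes SC variants per kilobase of sequence, and the names `GeneCounts`, `count_singletons`, `difference_score`, and `sum_score` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class GeneCounts:
    """SC variant counts and region lengths for one gene in one cohort."""
    gene: str
    sc_coding: int      # SC variants found in the coding region
    sc_noncoding: int   # SC variants found in the non-coding region
    coding_bp: int      # coding region length in base pairs
    noncoding_bp: int   # non-coding region length in base pairs


def count_singletons(allele_counts: Iterable[int]) -> int:
    """Count variants whose allele count in the cohort is exactly one."""
    return sum(1 for ac in allele_counts if ac == 1)


def normalized(count: int, length_bp: int) -> float:
    """ASSUMPTION: normalize as SC variants per kilobase.
    The abstract does not specify the normalization."""
    if length_bp <= 0:
        raise ValueError("region length must be positive")
    return 1000.0 * count / length_bp


def difference_score(g: GeneCounts) -> float:
    """Coding minus non-coding normalized SC count.
    Negative values suggest SC variants are confined to non-coding sequence."""
    return normalized(g.sc_coding, g.coding_bp) - normalized(g.sc_noncoding, g.noncoding_bp)


def sum_score(g: GeneCounts) -> float:
    """Complementary score: coding plus non-coding normalized SC count."""
    return normalized(g.sc_coding, g.coding_bp) + normalized(g.sc_noncoding, g.noncoding_bp)


if __name__ == "__main__":
    # Made-up numbers for a hypothetical gene, purely for illustration.
    example = GeneCounts("GENE_A", sc_coding=2, sc_noncoding=200,
                         coding_bp=4000, noncoding_bp=100000)
    print(example.gene, difference_score(example), sum_score(example))
```

With this normalization, a gene whose coding region carries fewer SC variants per kilobase than its non-coding region gets a negative difference score, which matches the abstract's reading of negative values as a sign of constraint. Comparing the scores across cohorts would then require running the same calculation on each population separately.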

Updated: 2020-04-21