In silico saturation mutagenesis of cancer genes
Nature (IF 50.5), Pub Date: 2021-07-28, DOI: 10.1038/s41586-021-03771-1
Ferran Muiños 1, Francisco Martínez-Jiménez 1, Oriol Pich 1, Abel Gonzalez-Perez 1,2, Nuria Lopez-Bigas 1,2,3

Despite the existence of good catalogues of cancer genes [1,2], identifying the specific mutations of those genes that drive tumorigenesis across tumour types is still a largely unsolved problem. As a result, most mutations identified in cancer genes across tumours are of unknown significance to tumorigenesis [3]. We propose that the mutations observed in thousands of tumours (natural experiments testing their oncogenic potential, replicated across individuals and tissues) can be exploited to solve this problem. From these mutations, features that describe the mechanism of tumorigenesis of each cancer gene and tissue may be computed and used to build machine learning models that encapsulate these mechanisms. Here we demonstrate the feasibility of this solution by building and validating 185 gene–tissue-specific machine learning models that outperform experimental saturation mutagenesis in the identification of driver and passenger mutations. The models and their assessment of each mutation are designed to be interpretable, thus avoiding a black-box prediction device. Using these models, we outline the blueprints of potential driver mutations in cancer genes, and demonstrate the role of mutation probability in shaping the landscape of observed driver mutations. These blueprints will support the interpretation of newly sequenced tumours in patients and the study of the mechanisms of tumorigenesis of cancer genes across tissues.
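
To make the term "in silico saturation mutagenesis" concrete, the sketch below enumerates every possible single-nucleotide substitution of a coding sequence and labels its protein-level consequence using the standard genetic code. This is only the enumeration step that the term implies. It is not the authors' pipeline, and it does not compute the features or the gene–tissue-specific models described in the abstract. The function name `saturate`, the tuple layout, and the toy sequence are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def saturate(cds):
    """Yield every single-nucleotide substitution of a coding sequence.

    Hypothetical helper: yields tuples of
    (1-based position, ref base, alt base, ref amino acid, alt amino acid, consequence).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0 or set(cds) - set(BASES):
        raise ValueError("expected an in-frame DNA sequence over A, C, G, T")
    for pos, ref in enumerate(cds):
        offset = pos % 3                      # position within the codon
        start = pos - offset                  # index of the codon's first base
        ref_codon = cds[start:start + 3]
        ref_aa = CODON_TABLE[ref_codon]
        for alt in BASES:
            if alt == ref:
                continue
            alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
            alt_aa = CODON_TABLE[alt_codon]
            if alt_aa == ref_aa:
                consequence = "synonymous"
            elif alt_aa == "*":
                consequence = "nonsense"
            elif ref_aa == "*":
                consequence = "stop_lost"
            else:
                consequence = "missense"
            yield (pos + 1, ref, alt, ref_aa, alt_aa, consequence)


if __name__ == "__main__":
    toy_cds = "ATGTGG"  # toy sequence (Met-Trp), not a real cancer gene
    mutations = list(saturate(toy_cds))
    print(len(mutations), "possible substitutions")
    print(Counter(m[5] for m in mutations))
```

A sequence of length L always yields 3L substitutions. In the setting the abstract describes, each enumerated mutation would then be scored by a trained model as a potential driver or passenger; that scoring step is outside the scope of this sketch.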




Updated: 2021-07-28