Accurate and efficient structure-based computational mutagenesis for modeling fluorescence levels of Aequorea victoria green fluorescent protein mutants.
Protein Engineering, Design and Selection ( IF 2.4 ) Pub Date : 2020-09-14 , DOI: 10.1093/protein/gzaa022
Majid Masso

A computational mutagenesis technique was used to characterize the structural effects associated with over 46 000 single and multiple amino acid variants of Aequorea victoria green fluorescent protein (GFP), whose functional effects (fluorescence levels) were recently measured by experimental researchers. For each GFP mutant, the approach generated a single score reflecting the overall change in sequence-structure compatibility relative to native GFP, as well as a vector of environmental perturbation (EP) scores characterizing the impact at all GFP residue positions. A significant GFP structure–function relationship (P < 0.0001) was elucidated by comparing the sequence-structure compatibility scores with the functional data. Next, the computed vectors for GFP mutants were used to train predictive models of fluorescence by implementing random forest (RF) classification and tree regression machine learning algorithms. Classification performance reached 0.93 for sensitivity, 0.91 for precision and 0.90 for balanced accuracy, and regression models led to Pearson’s correlation as high as r = 0.83 between experimental and predicted GFP mutant fluorescence. An RF model trained on a subset of over 1000 experimental single residue GFP mutants with measured fluorescence was used for predicting the 3300 remaining unstudied single residue mutants, with results complementing known GFP biochemical and biophysical properties. In addition, models trained on the subset of experimental GFP mutants harboring multiple residue replacements successfully predicted fluorescence of the single residue GFP mutants. The models developed for this study were accurate and efficient, and their predictions outperformed those of several related state-of-the-art methods.
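The classification and regression figures quoted above (sensitivity, precision, balanced accuracy, Pearson's r) can be illustrated with a minimal sketch. It assumes the standard textbook definitions of these metrics and a binary labeling in which 1 marks a fluorescent mutant and 0 a non-fluorescent one; the abstract does not state the labeling or the evaluation code, so the function names and toy data here are hypothetical and are not the author's implementation.

```python
from math import sqrt


def classification_metrics(y_true, y_pred):
    """Sensitivity, precision and balanced accuracy for binary labels.

    Assumption: 1 = fluorescent (positive class), 0 = non-fluorescent.
    Both classes must be present, and at least one positive must be
    predicted; otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "balanced_accuracy": balanced_accuracy,
    }


def pearson_r(x, y):
    """Pearson correlation between experimental and predicted values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Toy data only; these are not values from the study.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 0, 0, 0]
toy_metrics = classification_metrics(toy_true, toy_pred)
toy_r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Balanced accuracy averages the per-class recall, which makes it less sensitive than plain accuracy to an uneven split between fluorescent and non-fluorescent mutants.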

Updated: 2020-09-15