PANDA: Predicting the change in proteins binding affinity upon mutations by finding a signal in primary structures
Journal of Bioinformatics and Computational Biology (IF 1), Pub Date: 2021-06-11, DOI: 10.1142/s0219720021500153
Wajid Arshad Abbasi 1, Syed Ali Abbas 1, Saiqa Andleeb 2

Accurately determining the change in protein binding affinity upon mutation is important for discovering novel therapeutics and for guiding mutagenesis studies. Measuring this change requires sophisticated, expensive, and time-consuming wet-lab experiments, which computational methods can support. Most available computational prediction techniques depend on protein structures, which limits their applicability to protein complexes with known 3D structures. In this work, we explore sequence-based prediction of the change in protein binding affinity upon mutation. We also question whether K-fold cross-validation (CV) across mutations, as adopted in previous studies, is effective for assessing how well such predictors generalize to cases in which no mutation is known during training. We use protein sequence information rather than protein structures, together with machine learning techniques, to predict the change in protein binding affinity upon mutation. Our proposed sequence-based predictor, called PANDA, performs comparably to existing methods when gauged through an appropriate CV scheme and on an external independent test dataset. On the external test dataset, PANDA achieves a maximum Pearson correlation coefficient of 0.52, compared with a maximum of 0.59 for MutaBind, a state-of-the-art protein structure-based method. Our sequence-based method thus combines wide applicability with performance comparable to existing protein structure-based methods. PANDA is accessible through a cloud-based webserver at https://sites.google.com/view/wajidarshad/software, and its Python code is available at https://github.com/wajidarshad/panda.
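
The concern about K-fold CV across mutations can be illustrated with a minimal sketch. This is not PANDA's code: the complex identifiers, fold count, and function names below are hypothetical, and grouping folds by complex is assumed here only as one example of a stricter scheme, since the abstract does not name the scheme the authors used. The sketch shows that splitting over individual mutations can place mutations of the same complex on both the training and test side, while splitting over complexes cannot. It also includes a plain Pearson correlation, the metric reported in the abstract.

```python
import random
from collections import defaultdict


def kfold_over_mutations(n_samples, k, seed=0):
    """Plain K-fold: shuffle mutation indices and cut them into k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    for f in range(k):
        test = idx[f::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        yield train, test


def kfold_over_complexes(complex_ids, k, seed=0):
    """Grouped K-fold: all mutations of one complex land in the same fold."""
    by_complex = defaultdict(list)
    for i, cid in enumerate(complex_ids):
        by_complex[cid].append(i)
    groups = sorted(by_complex)
    random.Random(seed).shuffle(groups)
    for f in range(k):
        test_groups = set(groups[f::k])
        test = [i for g in groups if g in test_groups for i in by_complex[g]]
        train = [i for g in groups if g not in test_groups for i in by_complex[g]]
        yield train, test


def shared_complexes(complex_ids, train, test):
    """Complexes that contribute mutations to both training and test sets."""
    return {complex_ids[i] for i in train} & {complex_ids[i] for i in test}


def pearson(x, y):
    """Pearson correlation coefficient (undefined if either input is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Toy data (hypothetical): 6 complexes with 5 mutations each.
complex_ids = [f"C{c}" for c in range(6) for _ in range(5)]
K = 4

leaky = [shared_complexes(complex_ids, tr, te)
         for tr, te in kfold_over_mutations(len(complex_ids), K)]
clean = [shared_complexes(complex_ids, tr, te)
         for tr, te in kfold_over_complexes(complex_ids, K)]

print("complexes shared per fold, CV over mutations:", [len(s) for s in leaky])
print("complexes shared per fold, CV over complexes:", [len(s) for s in clean])
```

In this toy setup every fold of the mutation-level split shares at least one complex with its training set, so a score from that split says little about complexes the model has never seen. The complex-level split shares none.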

Updated: 2021-06-11