Accurate prediction of chemical shifts for aqueous protein structure on “Real World” data
Chemical Science (IF 7.6), Pub Date: 2020/03/03, DOI: 10.1039/c9sc06561j
Jie Li 1, 2 , Kochise C Bennett 1, 2 , Yuchen Liu 1, 2 , Michael V Martin 3 , Teresa Head-Gordon 1, 2, 3, 4

Here we report a new machine learning algorithm for protein chemical shift prediction that outperforms existing chemical shift calculators on realistic data that has neither been heavily curated nor had test predictions eliminated ad hoc. Our UCBShift predictor implements two modules: a transfer prediction module that employs both sequence and structural alignment to select reference candidates for experimental chemical shift replication, and a redesigned machine learning module based on random forest regression that uses more, and more carefully curated, feature-extracted data. When combined, the two modules achieve state-of-the-art accuracy for predicting chemical shifts on a randomly selected dataset without careful curation, with root-mean-square errors of 0.31 ppm for amide hydrogens, 0.19 ppm for Hα, 0.84 ppm for C′, 0.81 ppm for Cα, 1.00 ppm for Cβ, and 1.81 ppm for N. When similar sequences or structurally related proteins are available, UCBShift shows superior native state selection from misfolded decoy sets compared to SPARTA+ and SHIFTX2, and even without homology we exceed the current prediction accuracy of all other popular chemical shift predictors.
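
The per-atom root-mean-square errors quoted above are computed by grouping prediction errors by atom type. As a minimal sketch of that arithmetic only (not the UCBShift code), the following assumes a hypothetical list of (atom type, predicted shift, observed shift) records; the function name and the sample values are invented for illustration and are not data from the paper.

```python
import math
from collections import defaultdict


def rmse_by_atom_type(records):
    """Return {atom_type: RMSE in ppm} for (atom_type, predicted, observed) records.

    Hypothetical helper for illustration; not part of UCBShift.
    """
    squared_errors = defaultdict(list)
    for atom_type, predicted, observed in records:
        squared_errors[atom_type].append((predicted - observed) ** 2)
    # RMSE = sqrt(mean of squared errors), computed separately per atom type
    return {
        atom: math.sqrt(sum(errs) / len(errs))
        for atom, errs in squared_errors.items()
    }


# Invented example shifts in ppm (not taken from the paper)
example = [
    ("H", 8.30, 8.00),
    ("H", 7.90, 8.30),
    ("CA", 56.0, 57.0),
    ("CA", 60.0, 59.0),
]
result = rmse_by_atom_type(example)
for atom, value in sorted(result.items()):
    print(f"{atom}: {value:.2f} ppm")
```

Reporting one RMSE per atom type, as the abstract does, keeps nuclei with very different shift ranges (for example H versus N) from being averaged together.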

Updated: 2020-03-26