Protein structure prediction using deep learning distance and hydrogen-bonding restraints in CASP14
Proteins: Structure, Function, and Bioinformatics (IF 2.9), Pub Date: 2021-07-30, DOI: 10.1002/prot.26193
Wei Zheng 1 , Yang Li 1, 2 , Chengxin Zhang 1 , Xiaogen Zhou 1 , Robin Pearce 1 , Eric W. Bell 1 , Xiaoqiang Huang 1 , Yang Zhang 1, 3

In this article, we report 3D structure prediction results by two of our best server groups (“Zhang-Server” and “QUARK”) in CASP14. These two servers were built based on the D-I-TASSER and D-QUARK algorithms, which integrated four newly developed components into the classical protein folding pipelines, I-TASSER and QUARK, respectively. The new components include: (a) a new multiple sequence alignment (MSA) collection tool, DeepMSA2, which is extended from the DeepMSA program; (b) a contact-based domain boundary prediction algorithm, FUpred, to detect protein domain boundaries; (c) a residual convolutional neural network-based method, DeepPotential, to predict multiple spatial restraints by co-evolutionary features derived from the MSA; and (d) optimized spatial restraint energy potentials to guide the structure assembly simulations. For 37 FM targets, the average TM-scores of the first models produced by D-I-TASSER and D-QUARK were 96% and 112% higher than those constructed by I-TASSER and QUARK, respectively. The data analysis indicates noticeable improvements produced by each of the four new components, especially for the newly added spatial restraints from DeepPotential and the well-tuned force field that combines spatial restraints, threading templates, and generic knowledge-based potentials. However, challenges still exist in the current pipelines. These include difficulties in modeling multi-domain proteins due to low accuracy in inter-domain distance prediction and modeling protein domains from oligomer complexes, as the co-evolutionary analysis cannot distinguish inter-chain and intra-chain distances. Specifically tuning the deep learning-based predictors for multi-domain targets and protein complexes may be helpful to address these issues.
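The "96% higher" and "112% higher" figures above are relative improvements in the average TM-score of the first models. As a minimal sketch of that arithmetic, assuming the comparison is the ratio of the two means over the same target set: the function name `relative_improvement` and the score values below are hypothetical illustrations, not data or code from the paper.

```python
from statistics import mean


def relative_improvement(new_scores, baseline_scores):
    """Return the percent by which mean(new_scores) exceeds mean(baseline_scores).

    Both lists are assumed to hold one TM-score per target, in the same order.
    """
    if len(new_scores) != len(baseline_scores):
        raise ValueError("score lists must cover the same targets")
    baseline_mean = mean(baseline_scores)
    if baseline_mean <= 0:
        raise ValueError("baseline mean must be positive")
    return 100.0 * (mean(new_scores) - baseline_mean) / baseline_mean


# Hypothetical TM-scores for four targets (illustrative only, NOT from the paper).
baseline = [0.25, 0.30, 0.20, 0.25]
improved = [0.50, 0.60, 0.40, 0.50]

print(f"{relative_improvement(improved, baseline):.0f}% higher")
```

Under this reading, a 96% improvement means the new average TM-score is roughly 1.96 times the baseline average, not that 96% of targets improved.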

Updated: 2021-08-09