Decomposing Structural Response Due to Sequence Changes in Protein Domains with Machine Learning.
Journal of Molecular Biology (IF 4.7), Pub Date: 2020-05-30, DOI: 10.1016/j.jmb.2020.05.021
Patrick Bryant, Arne Elofsson

How protein domain structure changes in response to mutations is not well understood. Some mutations change the structure drastically, while most result in only small changes. To understand this, we decompose the relationship between changes in domain sequence and structure using machine learning. We select pairs of evolutionarily related domains spanning a broad range of evolutionary distances. In contrast to earlier studies, we do not find a strictly linear relationship between sequence and structural changes. We train a random forest regressor that predicts the structural similarity between pairs with an average error of 0.029 lDDT (local Distance Difference Test) score units and a correlation coefficient of 0.92. Decomposing the feature importance shows that domain length, or analogously size, is the most important feature. Our model enables assessing deviations in relative structural response, and thus predicting evolutionary trajectories, in protein domains across evolution.
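
The abstract reports structural similarity as an lDDT score without describing the metric. The sketch below is a simplified, C-alpha-only illustration of how such a score can be computed. It is not the authors' pipeline. The function name `lddt_ca`, the 15 Å inclusion radius, the 0.5/1/2/4 Å tolerance thresholds (taken from the commonly cited lDDT definition), and the toy coordinates are all assumptions made for illustration. It also assumes the two domains are already aligned residue by residue, so both coordinate lists have the same length.

```python
import math


def lddt_ca(ref, model, radius=15.0, thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Simplified C-alpha lDDT between two pre-aligned coordinate lists.

    ref, model: sequences of (x, y, z) tuples, one per aligned residue.
    Returns the fraction of reference distances (shorter than `radius`)
    that are preserved in the model, averaged over the thresholds.
    """
    if len(ref) != len(model):
        raise ValueError("coordinate lists must have equal length")

    preserved = 0  # count of (pair, threshold) checks that pass
    total = 0      # count of (pair, threshold) checks performed
    n = len(ref)
    for i in range(n):
        for j in range(i + 1, n):
            d_ref = math.dist(ref[i], ref[j])
            if d_ref >= radius:
                continue  # only local distances in the reference count
            diff = abs(d_ref - math.dist(model[i], model[j]))
            total += len(thresholds)
            preserved += sum(diff < t for t in thresholds)
    return preserved / total if total else 0.0


# Toy data (hypothetical): a straight 5-residue chain with 3.8 A spacing,
# and a copy whose last residue is displaced by 3 A along y.
ref = [(3.8 * k, 0.0, 0.0) for k in range(5)]
model = ref[:-1] + [(ref[-1][0], 3.0, 0.0)]

print(lddt_ca(ref, ref))
print(lddt_ca(ref, model))
```

Because the score depends only on internal distances, it needs no structural superposition. A score of 1 means every local distance is preserved within the tightest tolerance, and lower values indicate growing local deviation.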




Updated: 2020-07-24