ANuPP: A versatile tool to predict aggregation nucleating regions in peptides and proteins
Journal of Molecular Biology (IF 5.6), Pub Date: 2020-11-12, DOI: 10.1016/j.jmb.2020.11.006
R Prabakaran 1 , Puneet Rawat 1 , Sandeep Kumar 2 , M Michael Gromiha 3

Short aggregation-prone sequence motifs can trigger aggregation in peptides and proteins. Most algorithms developed so far to identify potential aggregation-prone regions (APRs) use amino acid residue composition and/or sequence pattern features. In this work, we investigated the importance of atomic-level rather than residue-level characteristics for understanding the initiation of aggregation in proteins and peptides. Using atomic-level features, we developed an ensemble classifier, ANuPP, to predict aggregation-nucleating regions in peptides and proteins. On a dataset of 1279 hexapeptides, ANuPP achieved an area under the curve (AUC) of 0.831 with 77% accuracy in 10-fold cross-validation, and an AUC of 0.883 with 83% accuracy on a blind test dataset of 142 hexapeptides. Further, it showed an average segment overlap (SOV) score of 48.7% in identifying APRs in 37 proteins. ANuPP performs better than other methods reported in the literature on both amyloidogenic hexapeptide prediction and APR identification. A web server for ANuPP is available at https://web.iitm.ac.in/bioinfo2/ANuPP/. Insights gained from this work demonstrate the importance of atomic and functional group characteristics for understanding the diverse atomic-level origins and mechanisms of protein aggregation.
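For readers unfamiliar with the two hexapeptide metrics quoted above, the following is a minimal sketch of how an AUC and an accuracy can be computed from classifier scores. It is not the ANuPP implementation: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the AUC is taken here to be the area under the ROC curve, computed as the fraction of positive/negative pairs ranked correctly.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form equals the area under
    the ROC curve.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of examples classified correctly at the given threshold.

    The 0.5 default is an assumption, not a value taken from the paper.
    """
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return correct / len(labels)


# Hypothetical toy data: 1 = amyloidogenic hexapeptide, 0 = non-amyloidogenic.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

if __name__ == "__main__":
    print(f"AUC      = {roc_auc(labels, scores):.3f}")
    print(f"accuracy = {accuracy(labels, scores):.3f}")
```

AUC depends only on how the scores rank the peptides, whereas accuracy depends on the chosen threshold, which is why the two numbers are reported together.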




Last updated: 2020-11-12