RF-GlutarySite: a random forest based predictor for glutarylation sites.
Molecular Omics ( IF 2.9 ) Pub Date : 2019-04-26 , DOI: 10.1039/c9mo00028c
Hussam J Al-Barakati 1 , Hiroto Saigo 2 , Robert H Newman 3 , Dukka B Kc 1

Glutarylation, a newly identified posttranslational modification that occurs on lysine residues, has recently emerged as an important regulator of several metabolic and mitochondrial processes. However, the specific sites of modification on individual proteins, as well as the extent of glutarylation throughout the proteome, remain largely uncharacterized. Though informative, proteomic approaches based on mass spectrometry can be expensive, technically challenging and time-consuming. Therefore, the ability to predict glutarylation sites from protein primary sequences can complement proteomics analyses and help researchers study the characteristics and functional consequences of glutarylation. To this end, we used Random Forest (RF) machine learning strategies to identify the physicochemical and sequence-based features that correlated most substantially with glutarylation. We then used these features to develop a novel RF-based method to predict glutarylation sites from primary amino acid sequences. Based on 10-fold cross-validation, the resulting algorithm, termed 'RF-GlutarySite', achieved scores of 75%, 81%, 68% and 0.50 for accuracy (ACC), sensitivity (SN), specificity (SP) and Matthews correlation coefficient (MCC), respectively. Likewise, using an independent test set, RF-GlutarySite exhibited ACC, SN, SP and MCC scores of 72%, 73%, 70% and 0.43, respectively. Results using both 10-fold cross-validation and an independent test set were on par with or better than those achieved by existing glutarylation site predictors. Notably, RF-GlutarySite achieved the highest SN score among available glutarylation site prediction tools. Consequently, our method has the potential to uncover new glutarylation sites and to facilitate the discovery of relationships between glutarylation and well-known lysine modifications, such as acetylation, methylation and SUMOylation, as well as a number of recently identified lysine modifications, such as malonylation and succinylation.
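
For readers unfamiliar with the four reported metrics, the following is a minimal sketch of how ACC, SN, SP and MCC are conventionally computed from the counts of a binary confusion matrix. It is not the authors' code. The function name `binary_metrics` and the example counts are hypothetical; the counts assume a balanced set of 100 positive and 100 negative sites, which the abstract does not state, and were chosen only so that SN and SP match the reported cross-validation percentages.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute ACC, SN, SP and MCC from confusion-matrix counts.

    tp/fn: glutarylated sites predicted as positive/negative.
    tn/fp: non-glutarylated sites predicted as negative/positive.
    """
    total = tp + fp + tn + fn
    acc = (tp + tn) / total   # accuracy: fraction of all sites classified correctly
    sn = tp / (tp + fn)       # sensitivity: fraction of true sites recovered
    sp = tn / (tn + fp)       # specificity: fraction of non-sites rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; return 0.0 by convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "SN": sn, "SP": sp, "MCC": mcc}


# Hypothetical counts (NOT taken from the paper): 100 positives, 100 negatives.
metrics = binary_metrics(tp=81, fp=32, tn=68, fn=19)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike ACC, MCC uses all four cells of the confusion matrix, so it remains informative when positive and negative sites are imbalanced, which is one reason it is commonly reported alongside SN and SP for site predictors.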

Updated: 2019-06-11