Structure–sequence features based prediction of phosphosites of serine/threonine protein kinases of Mycobacterium tuberculosis
Proteins: Structure, Function, and Bioinformatics (IF 3.2). Pub Date: 2021-07-30. DOI: 10.1002/prot.26195
Vipul V. Nilkanth, Shekhar C. Mande

Elucidating signaling events in a pathogen is potentially important for tackling the infection it causes. Such events, mediated by protein phosphorylation, play important roles in infection. To predict the phosphosites and substrates of the serine/threonine protein kinases (STPKs) of Mycobacterium tuberculosis, we have developed a machine learning-based approach that uses kinase-peptide structure–sequence data. The approach uses features derived from the three-dimensional structural environment of the kinase and from known phosphosite sequences to generate support vector machine (SVM)-based, kinase-specific predictions of phosphosites for STPKs with little or no substrate data. Of the four machine learning algorithms we tried (random forest, logistic regression, SVM, and k-nearest neighbors), SVM performed best, with an area under the receiver operating characteristic curve (AUC-ROC) of 0.88 on the independent testing dataset and a 10-fold cross-validation accuracy of ~81.6% for the final model. Our predicted phosphosites of M. tuberculosis STPKs form a useful resource for experimental biologists, enabling elucidation of STPK-mediated posttranslational regulation of important cellular processes.
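
The two evaluation measures named in the abstract, the area under the ROC curve and k-fold cross-validation, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `roc_auc` and `k_fold_indices` and the toy labels and scores are assumptions made up for illustration, and the structure–sequence features and SVM training are not reproduced here.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive example receives a
    higher score than a randomly chosen negative one; ties count as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds; return (train, test) index pairs."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


# Toy data (hypothetical): 1 = phosphosite, 0 = non-phosphosite.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(roc_auc(toy_labels, toy_scores))
print(len(k_fold_indices(25, 10)))
```

In a 10-fold setting, a model would be trained on each `train` index list and scored on the matching `test` list, with the per-fold accuracies averaged. In practice the samples would normally be shuffled or stratified by class before splitting, which this sketch omits.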

Updated: 2021-07-30