Succinylation Site Prediction Based on Protein Sequences Using the IFS-LightGBM (BO) Model
Computational and Mathematical Methods in Medicine (IF 2.809), Pub Date: 2020-11-11, DOI: 10.1155/2020/8858489
Lu Zhang 1 , Min Liu 1 , Xinyi Qin 1 , Guangzhong Liu 1

Succinylation is an important posttranslational modification of proteins that plays a key role in regulating protein conformation and controlling cellular function. Many studies have shown that succinylation of protein lysine residues is closely related to the occurrence of many diseases. To understand the mechanism of succinylation in depth, it is necessary to identify succinylation sites in proteins accurately. In this study, we develop a new model, IFS-LightGBM (BO), which combines the incremental feature selection (IFS) method, the LightGBM feature selection method, the Bayesian optimization algorithm, and the LightGBM classifier to predict succinylation sites in proteins. Specifically, pseudo amino acid composition (PseAAC), position-specific scoring matrix (PSSM), disorder status, and Composition of k-spaced Amino Acid Pairs (CKSAAP) are first used to extract feature information. Then, the LightGBM feature selection method is combined with the IFS method to select the optimal feature subset for the LightGBM classifier. Finally, to increase prediction accuracy and reduce the computational load, the Bayesian optimization algorithm is used to tune the parameters of the LightGBM classifier. The results show that the IFS-LightGBM (BO)-based prediction model performs better when evaluated with common metrics such as accuracy, recall, precision, Matthews correlation coefficient (MCC), and F-measure.
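
Three steps named in the abstract can be illustrated with a minimal, dependency-free Python sketch: the CKSAAP encoding, the incremental feature selection loop, and the evaluation metrics. This is a sketch under stated assumptions, not the authors' implementation. The window string, the `k_max` value, and the `score_subset` callback are hypothetical; in the paper the feature ranking comes from LightGBM and the subsets are scored with a LightGBM classifier, neither of which is reproduced here.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def cksaap(window, k_max=2):
    """CKSAAP encoding of a sequence window.

    For each gap k in 0..k_max, returns the frequency of every ordered residue
    pair separated by k positions (400 values per k). k_max is an assumed
    value; the abstract does not state the range used.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_pairs = len(window) - k - 1  # number of k-spaced pairs in the window
        for i in range(n_pairs):
            pair = window[i] + window[i + k + 1]
            if pair in counts:  # ignore padding or non-standard residues
                counts[pair] += 1
        features.extend(counts[p] / n_pairs if n_pairs > 0 else 0.0 for p in PAIRS)
    return features


def incremental_feature_selection(ranked_features, score_subset):
    """Score the top-1, top-2, ... ranked features and keep the best prefix.

    score_subset is a hypothetical callback returning a score for a feature
    subset (for example, a cross-validated metric of a classifier).
    """
    best_subset, best_score = [], float("-inf")
    for n in range(1, len(ranked_features) + 1):
        subset = ranked_features[:n]
        score = score_subset(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F-measure and MCC for 0/1 labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "mcc": mcc,
    }
```

In a fuller pipeline, `cksaap` would be applied to a fixed-length window centered on each candidate lysine, the resulting columns would be ranked by a feature-importance measure, and `incremental_feature_selection` would be called with a scoring function that trains and validates the classifier on each prefix of that ranking.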

Updated: 2020-11-12