Antioxidant Proteins' Identification Based on Support Vector Machine.
Combinatorial Chemistry & High Throughput Screening ( IF 1.6 ) Pub Date : 2020-05-01 , DOI: 10.2174/1386207323666200306125538
Yuanke Xu 1 , Yaping Wen 1 , Guosheng Han 1

Background: Growing evidence indicates that, in human disease, cell metabolism is closely associated with proteins. Structural mutations and dysregulation of these proteins contribute to the development of complex diseases. Free radicals are unstable molecules that seek electrons from surrounding atoms to gain stability. Once a free radical binds to an atom in the body, a chain reaction occurs that damages cells and DNA. An antioxidant protein is a substance that protects cells from free radical damage. Accurate identification of antioxidant proteins is important for understanding their role in delaying aging and in preventing and treating related diseases. Computational methods for identifying antioxidant proteins have therefore become an effective way to pinpoint candidates before experimental verification.

Methods: In this study, a support vector machine (SVM) was used to identify antioxidant proteins. Amino acid composition and 9-gap dipeptide composition served as features, and Principal Component Analysis (PCA) was used for feature reduction.
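As an illustration of the two feature types named in the Methods, the sketch below computes amino acid composition (20 values) and g-gap dipeptide composition (400 values) for one sequence. It assumes the common definition of a g-gap dipeptide as two residues separated by g intervening positions. The function names, the handling of non-standard residues, and the concatenation order are assumptions for this example, not details taken from the paper.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def amino_acid_composition(seq):
    """Fraction of each standard residue in the sequence (20 values)."""
    seq = seq.upper()
    counts = {aa: seq.count(aa) for aa in AMINO_ACIDS}
    total = sum(counts.values())
    return [counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS]


def g_gap_dipeptide_composition(seq, gap=9):
    """Frequency of residue pairs separated by `gap` positions (400 values).

    Pairs containing a non-standard residue are skipped (an assumption);
    sequences shorter than gap + 2 yield an all-zero vector.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(seq) - gap - 1):
        pair = seq[i] + seq[i + gap + 1]
        if pair in counts:
            counts[pair] += 1
    total = sum(counts.values())
    return [counts[p] / total if total else 0.0 for p in DIPEPTIDES]


def extract_features(seq, gap=9):
    """Concatenate both descriptors into one 420-dimensional vector."""
    return amino_acid_composition(seq) + g_gap_dipeptide_composition(seq, gap)
```

A vector produced this way could then be passed to a dimensionality-reduction step such as PCA and to an SVM classifier; those steps are omitted here because the abstract does not give their settings.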

Results: The method reached a prediction accuracy (Acc) of 98.38%, a recall on positive samples (sensitivity, Sn) of 99.27%, a recall on negative samples (specificity, Sp) of 97.54%, and a Matthews correlation coefficient (MCC) of 0.9678. To further evaluate the proposed method, we tested it on 20 antioxidant proteins from the National Center for Biotechnology Information (NCBI); all 20 were correctly predicted. These results indicate that our method outperforms state-of-the-art methods for identifying antioxidant proteins.
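For readers unfamiliar with the four reported indices, the following sketch shows how Acc, Sn, Sp, and MCC are conventionally computed from a binary confusion matrix. The function name is hypothetical and the counts in the usage note are illustrative only; they are not the paper's confusion matrix, which the abstract does not report.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Standard binary metrics from confusion-matrix counts.

    tp/fn: positive samples predicted correctly/incorrectly.
    tn/fp: negative samples predicted correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    acc = (tp + tn) / total if total else 0.0
    sn = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    sp = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Acc": acc, "Sn": sn, "Sp": sp, "MCC": mcc}
```

For example, `classification_metrics(tp=90, fn=10, tn=80, fp=20)` gives Acc = 0.85, Sn = 0.90, and Sp = 0.80. MCC is the most conservative of the four on imbalanced data because it uses all four cells of the confusion matrix.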

Conclusion: We collected experimental protein data from UniProt, including 253 antioxidant proteins and 1552 non-antioxidant proteins. The optimal feature set used in this paper combines amino acid composition and 9-gap dipeptide composition. Proteins are classified with a support vector machine, and the evaluation metrics are obtained by 5-fold cross-validation. Comparison with existing classification models further shows that the SVM model constructed in this paper is helpful for recognizing antioxidant proteins.
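The 5-fold cross-validation mentioned in the Conclusion can be sketched as a partition of sample indices into five train/test splits. Because the dataset is imbalanced (253 positives versus 1552 negatives), this example stratifies the folds by class. Stratification, the function name, and the fixed seed are assumptions made for illustration; the abstract does not say how the folds were built.

```python
import random


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) for k class-balanced folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        # Shuffle the indices of one class, then deal them round-robin
        # so every fold gets a near-equal share of that class.
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


# Label vector with the class sizes reported in the abstract.
labels = [1] * 253 + [0] * 1552
splits = list(stratified_k_fold(labels, k=5))
```

Each sample appears in exactly one test fold, so metrics averaged over the five folds use every protein once for evaluation.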




Updated: 2020-05-01