CarSite: identifying carbonylated sites of human proteins based on a one-sided selection resampling method
Molecular BioSystems, Pub Date: 2017-08-31, DOI: 10.1039/c7mb00363c
Yun Zuo, Cang-Zhi Jia

Protein carbonylation is one of the most important biomarkers of oxidative protein damage, and such damage is linked to various diseases and to aging. Accurate identification of carbonylation sites is therefore vital. In this study, CarSite, a novel bioinformatics tool, was established to identify carbonylation sites in human proteins. The one-sided selection (OSS) resampling method was used to build balanced training datasets, and 10-fold cross-validation tests on the Jia dataset showed that it performs better than a Monte Carlo resampling method. Moreover, a hybrid combination of four feature types was selected to optimize the performance of the predictor: position-specific amino acid propensity (PSAAP), composition of k-spaced amino acid pairs (CKSAAP), amino acid composition (AAC), and composition of hydrophobic and hydrophilic amino acids (CHHAA). In 10-fold cross-validation on the Jia dataset, CarSite achieved sensitivities for K-, P-, R-, and T-type peptides that were approximately 21%, 22%, 19%, and 18% higher, respectively, than those of iCar-PseCp, which was previously considered the best predictor for identifying carbonylation sites in human proteins. Furthermore, compared with other existing predictors, CarSite obtained much higher sensitivity and accuracy when tested on the same dataset.
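
To make two of the feature types named in the abstract concrete, here is a minimal Python sketch of AAC and CKSAAP encoding for a single peptide. It is an illustration under stated assumptions, not the CarSite implementation: the function names `aac` and `cksaap`, the 20-letter alphabet ordering, the default `k_max`, and the choice to normalize by the number of valid residue pairs are all hypothetical, and the abstract does not specify any of them.

```python
from itertools import product

# Assumption: the 20 standard amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(peptide):
    """Amino acid composition: fraction of each standard residue (20 values)."""
    residues = [r for r in peptide if r in AMINO_ACIDS]
    n = len(residues)
    return [residues.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


def cksaap(peptide, k_max=2):
    """Composition of k-spaced amino acid pairs for k = 0..k_max.

    For each gap k, counts pairs (peptide[i], peptide[i + k + 1]) and
    normalizes by the number of pairs made of standard residues.
    Returns 400 * (k_max + 1) values, gap by gap.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        total = 0
        for i in range(len(peptide) - k - 1):
            pair = peptide[i] + peptide[i + k + 1]
            if pair in counts:  # skip pairs containing non-standard symbols
                counts[pair] += 1
                total += 1
        features.extend(counts[p] / total if total else 0.0 for p in pairs)
    return features


if __name__ == "__main__":
    example = "AKAKA"  # toy peptide, not taken from the Jia dataset
    vector = aac(example) + cksaap(example, k_max=1)
    print(len(vector))
```

A hybrid feature vector of the kind the abstract describes would concatenate such blocks with the PSAAP and CHHAA encodings, which are not sketched here because they depend on training-set statistics and residue groupings that the abstract does not give.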

Updated: 2017-10-25