SuccSite: Incorporating Amino Acid Composition and Informative k-spaced Amino Acid Pairs to Identify Protein Succinylation Sites.
Genomics, Proteomics & Bioinformatics ( IF 11.5 ) Pub Date : 2020-06-24 , DOI: 10.1016/j.gpb.2018.10.010
Hui-Ju Kao , Van-Nui Nguyen , Kai-Yao Huang , Wen-Chi Chang , Tzong-Yi Lee

Protein succinylation is a biochemical reaction in which a succinyl group (-CO-CH2-CH2-CO-) is attached to the lysine residue of a protein molecule. Lysine succinylation plays important regulatory roles in living cells. However, studies in this field are limited by the difficulty in experimentally identifying the substrate site specificity of lysine succinylation. To facilitate this process, several tools have been proposed for the computational identification of succinylated lysine sites. In this study, we developed an approach to investigate the substrate specificity of lysine succinylated sites based on amino acid composition. Using experimentally verified lysine succinylated sites collected from public resources, the significant differences in position-specific amino acid composition between succinylated and non-succinylated sites were represented using the Two Sample Logo program. These findings enabled the adoption of an effective machine learning method, support vector machine, to train a predictive model with not only the amino acid composition, but also the composition of k-spaced amino acid pairs. After the selection of the best model using a ten-fold cross-validation approach, the selected model significantly outperformed existing tools based on an independent dataset manually extracted from published research articles. Finally, the selected model was used to develop a web-based tool, SuccSite, to aid the study of protein succinylation. Two proteins were used as case studies on the website to demonstrate the effective prediction of succinylation sites. We will regularly update SuccSite by integrating more experimental datasets. SuccSite is freely accessible at http://csb.cse.yzu.edu.tw/SuccSite/.
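The two feature types named in the abstract, amino acid composition and the composition of k-spaced amino acid pairs, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the example window, and the `max_k` value are assumptions made for illustration, and the selection of informative pairs and the support vector machine training described in the abstract are not shown.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so feature vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def aa_composition(window: str) -> list[float]:
    """Fraction of each standard amino acid among the residues of the window."""
    residues = [r for r in window if r in AMINO_ACIDS]
    total = len(residues) or 1  # avoid division by zero on an empty window
    return [residues.count(a) / total for a in AMINO_ACIDS]


def k_spaced_pair_composition(window: str, k: int) -> list[float]:
    """Frequency of each ordered residue pair separated by exactly k residues."""
    counts = dict.fromkeys(PAIRS, 0)
    n_positions = max(len(window) - k - 1, 0)  # number of (i, i + k + 1) pairs
    for i in range(n_positions):
        pair = window[i] + window[i + k + 1]
        if pair in counts:  # skip pairs containing non-standard symbols
            counts[pair] += 1
    denom = max(n_positions, 1)
    return [counts[p] / denom for p in PAIRS]


def encode(window: str, max_k: int = 2) -> list[float]:
    """Concatenate composition features with pair features for k = 0..max_k."""
    features = aa_composition(window)
    for k in range(max_k + 1):
        features.extend(k_spaced_pair_composition(window, k))
    return features


if __name__ == "__main__":
    # Hypothetical sequence window centered on a lysine (K); not from the paper.
    example_window = "AGLSEKTVDAR"
    vector = encode(example_window, max_k=2)
    print(len(vector))
```

Under these assumptions each window becomes a fixed-length numeric vector (20 composition values plus 400 pair frequencies per value of k), which is the kind of input a support vector machine classifier expects.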




Updated: 2020-06-24