Using deep neural networks and biological subwords to detect protein S-sulfenylation sites.
Briefings in Bioinformatics (IF 6.8), Pub Date: 2020-07-02, DOI: 10.1093/bib/bbaa128
Duyen Thi Do 1 , Thanh Quynh Trang Le 2 , Nguyen Quoc Khanh Le 3

Protein S-sulfenylation is a crucial post-translational modification (PTM) in which a hydroxyl group covalently binds to the thiol of cysteine. Recent studies have shown that this modification plays an important role in signal transduction, transcriptional regulation and apoptosis. To date, the dynamics of sulfenic acids in proteins remain unclear because of their fleeting nature. Identifying S-sulfenylation sites could therefore be the key to deciphering their structures and functions, which are important in cell biology and disease. However, for lack of effective methods, scientists in this field tend to be limited to a handful of wet-lab techniques that are time-consuming and not cost-effective. This motivated us to develop an in silico model that detects S-sulfenylation sites from protein sequence information alone. In this study, protein sequences were treated as natural language sentences composed of biological subwords, and a deep neural network was then used to perform classification. On the independent dataset, sensitivity, specificity, accuracy, Matthews correlation coefficient and area under the curve reached 85.71%, 69.47%, 77.09%, 0.5554 and 0.833, respectively. Our results suggested that the proposed method (fastSulf-DNN) achieved excellent performance in predicting S-sulfenylation sites compared with other well-known tools on a benchmark dataset.
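
The two technical ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how subwords are formed, so overlapping character n-grams are assumed here, and the function names `subwords` and `classification_metrics`, the n-gram range, and the example inputs are all hypothetical. The metric formulas are the standard confusion-matrix definitions of sensitivity, specificity, accuracy and Matthews correlation coefficient.

```python
import math


def subwords(sequence, n_min=1, n_max=3):
    """Split a protein sequence into overlapping character n-grams.

    Assumption: "biological subwords" are modelled as n-grams of
    length n_min..n_max; the abstract does not give the exact scheme.
    """
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(sequence) - n + 1):
            grams.append(sequence[i:i + n])
    return grams


def classification_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification metrics from counts.

    tp/fn: true sites predicted as positive/negative.
    tn/fp: non-sites predicted as negative/positive.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when the denominator is 0.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical peptide fragment, for illustration only.
    print(subwords("ACDC", n_min=2, n_max=3))
    # Illustrative counts, not taken from the paper.
    print(classification_metrics(tp=8, fn=2, tn=6, fp=4))
```

In a pipeline of this kind, the n-grams of each sequence window would be mapped to vectors and passed to the classifier, and the metrics would be computed on held-out predictions. Those steps are omitted here because the abstract gives no details about them.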

Updated: 2020-07-02