A deep attention network for predicting amino acid signals in the formation of α-helices
Journal of Bioinformatics and Computational Biology (IF 1), Pub Date: 2020-06-18, DOI: 10.1142/s0219720020500286
A Visibelli 1, P Bongini 2,3, A Rossi 2,3, N Niccolai 1, M Bianchini 2

The secondary and tertiary structure of a protein plays a primary role in determining its function. Although many folding prediction algorithms have been developed in past decades, mainly on the assumption that folding instructions are encoded within the protein sequence, experimental techniques remain the most reliable way to establish protein structures. In this paper, we searched for signals related to the formation of α-helices. We carried out a statistical analysis on a large dataset of experimentally characterized secondary structure elements to find over- or under-occurrences of specific amino acids defining the boundaries of helical moieties. To validate our hypothesis, we trained various machine learning models, each equipped with an attention mechanism, to predict the occurrence of α-helices. The attention mechanism makes it possible to interpret the model's decision by weighing the importance the predictor gives to each part of the input. The experimental results show that different models focus on the same subsequences, which can be seen as codes driving secondary structure formation.

Updated: 2020-06-18