Identifying polyadenylation signals with biological embedding via self-attentive gated convolutional highway networks
Applied Soft Computing (IF 7.2), Pub Date: 2021-01-22, DOI: 10.1016/j.asoc.2021.107133
Yanbu Guo, Dongming Zhou, Weihua Li, Jinde Cao, Rencan Nie, Lei Xiong, Xiaoli Ruan

Polyadenylation regulates mRNA function and translation efficiency. Accurate recognition of polyadenylation signals (PAS) is essential for understanding genome annotation and function. Current computational methods rely mainly on human-designed features and do not extract complex discriminative patterns. In this work, we first adopt a shallow neural network to automatically learn biological embeddings that capture discriminative patterns from biological sequences alone. Next, we devise a hybrid deep learning framework, termed PASNet, that integrates gated convolutional highway networks with a self-attention mechanism to automatically extract underlying patterns and then identify PAS in genomic sequences. Specifically, we build a gated convolutional highway unit from gated convolutional mechanisms; the unit provides paths that let information pass unimpeded between two adjacent layers. PASNet requires little prior knowledge and no laborious feature engineering. We evaluate PASNet on datasets from four different organisms through genome-wide experiments. Compared with published machine learning and deep learning predictors, PASNet reduces the error rate by 4% and 1.02% on average, respectively.
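
The abstract does not give the equations of the gated convolutional highway unit, so the following is only a minimal sketch of what such a unit could look like. It assumes a GLU-style gated convolution combined with the standard highway formulation, y = t * h + (1 - t) * x. The function names, the kernel width, the random initialisation, and the one-hot input (a stand-in for the learned biological embedding) are all illustrative assumptions and are not taken from PASNet. It is written in plain Python so that it runs without dependencies.

```python
import math
import random

def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def conv1d_same(x, weights, bias):
    """1D convolution along the sequence with zero padding (output length = input length).

    x:       list of L positions, each a list of d_in floats
    weights: weights[k][i][o] for kernel offset k, input channel i, output channel o
    bias:    list of d_out floats
    """
    width, d_in, d_out = len(weights), len(weights[0]), len(bias)
    half = width // 2
    out = []
    for t in range(len(x)):
        row = list(bias)
        for k in range(width):
            pos = t + k - half
            if 0 <= pos < len(x):  # positions outside the sequence count as zeros
                for i in range(d_in):
                    for o in range(d_out):
                        row[o] += x[pos][i] * weights[k][i][o]
        out.append(row)
    return out

def gated_conv_highway(x, params):
    """One gated convolutional highway unit (generic formulation, an assumption).

    h = conv_a(x) * sigmoid(conv_b(x))   gated (GLU-style) convolution
    t = sigmoid(conv_t(x))               transform gate
    y = t * h + (1 - t) * x              carry path lets x pass through unchanged
    """
    a = conv1d_same(x, *params["a"])
    b = conv1d_same(x, *params["b"])
    g = conv1d_same(x, *params["t"])
    y = []
    for row_a, row_b, row_g, row_x in zip(a, b, g, x):
        row = []
        for va, vb, vg, vx in zip(row_a, row_b, row_g, row_x):
            h = va * sigmoid(vb)
            t = sigmoid(vg)
            row.append(t * h + (1.0 - t) * vx)
        y.append(row)
    return y

def init_conv(rng, width, dim, gate_bias=0.0):
    # Hypothetical initialisation: small Gaussian weights, constant bias.
    weights = [[[rng.gauss(0.0, 0.1) for _ in range(dim)]
                for _ in range(dim)] for _ in range(width)]
    return weights, [gate_bias] * dim

# One-hot encoding is a stand-in for the learned embedding described in the abstract.
ONE_HOT = {"A": [1.0, 0.0, 0.0, 0.0], "C": [0.0, 1.0, 0.0, 0.0],
           "G": [0.0, 0.0, 1.0, 0.0], "T": [0.0, 0.0, 0.0, 1.0]}

def encode(seq):
    return [list(ONE_HOT[base]) for base in seq]

rng = random.Random(0)
params = {name: init_conv(rng, width=3, dim=4) for name in ("a", "b", "t")}
features = gated_conv_highway(encode("AATAAA"), params)  # 6 positions x 4 channels
```

When the transform gate `t` is near zero, the unit reduces to the identity, which is one way to read the abstract's statement that information can pass unimpeded between adjacent layers. A full model along the lines described would stack such units and follow them with self-attention and a classifier, which this sketch omits.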




Updated: 2021-02-01