DeepATT: a hybrid category attention neural network for identifying functional effects of DNA sequences.
Briefings in Bioinformatics (IF 6.8), Pub Date: 2021-05-20, DOI: 10.1093/bib/bbaa159
Jiawei Li 1, Yuqian Pu 1, Jijun Tang 2, Quan Zou 3, Fei Guo 1

Quantifying DNA properties is a challenging task in human genomics. Because the function of the vast majority of non-coding DNA is still poorly understood, progress on this task would be of great benefit to biological research. Different DNA sequences call for a wide variety of representations, and each specific function may depend on its own subset of the features computed in the front part of a learning model. Currently, however, most powerful models for multi-class prediction of non-coding DNA regulatory functions lack feature extraction and selection approaches tailored to specific functional effects, which makes it difficult to gain insight into the correlations among those functions. We therefore design a category attention layer and a category dense layer to select effective features and distinguish different DNA functions. In this study, we propose a hybrid deep neural network, called DeepATT, for identifying 919 regulatory functions on nearly 5 million DNA sequences. The model is built from four kinds of layers: a convolution layer captures regulatory motifs, a recurrent layer captures regulatory grammar, a category attention layer selects the relevant features for each function, and a category dense layer classifies the labels from the features selected for each regulatory function. We compare DeepATT with two leading existing prediction tools, DeepSEA and DanQ. DeepATT performs significantly better than these tools at identifying DNA functions, improving the area under the precision-recall curve by at least 1.6%. Furthermore, the category attention module can be used to mine important correlations among different DNA functions. Finally, by using attention and locally connected layers, the model greatly reduces the number of parameters while maintaining accuracy.
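
The last two stages described in the abstract, category attention followed by a category dense layer, can be illustrated with a small sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: dot-product attention with one learned query vector per function, a sigmoid output, and the names `category_attention`, `category_dense`, `queries` and `kernels`. The input `features` stands in for the per-position output of the convolution and recurrent layers, and the sizes are toy values, not the paper's.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def category_attention(features, queries):
    """Select features separately for each category (function).

    features: T x d list, one vector per sequence position.
    queries:  C x d list, one hypothetical learned query per category.
    Returns (contexts, weights): C x d context vectors and C x T weights.
    """
    contexts, weights = [], []
    for q in queries:
        # Score every position against this category's query.
        w = softmax([dot(q, h) for h in features])
        # Weighted sum of position vectors gives the category's features.
        ctx = [sum(w_t * h[j] for w_t, h in zip(w, features))
               for j in range(len(q))]
        contexts.append(ctx)
        weights.append(w)
    return contexts, weights


def category_dense(contexts, kernels, biases):
    """Locally connected output: category c uses only its own kernel."""
    return [1.0 / (1.0 + math.exp(-(dot(k, ctx) + b)))
            for ctx, k, b in zip(contexts, kernels, biases)]


def random_matrix(rows, cols, rng):
    """Random stand-in for learned weights or upstream features."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)]
            for _ in range(rows)]


rng = random.Random(0)
T, d, C = 6, 4, 3  # toy sizes; the paper predicts 919 functions
features = random_matrix(T, d, rng)  # stand-in for conv + recurrent output
queries = random_matrix(C, d, rng)
kernels = random_matrix(C, d, rng)
biases = [0.0] * C

contexts, weights = category_attention(features, queries)
probs = category_dense(contexts, kernels, biases)  # one probability per function
```

In this sketch the output stage holds only `C * (d + 1)` parameters, since each category reads its own `d`-dimensional context vector, which is one plausible reading of how attention and local connectivity reduce the parameter count. Comparing rows of `weights` across categories is likewise one way the attention module could expose relationships between functions.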

Updated: 2020-08-11