tRNA-DL: A Deep Learning Approach to Improve tRNAscan-SE Prediction Results.
Human Heredity (IF 1.8), Pub Date: 2019-01-28, DOI: 10.1159/000493215
Xin Gao, Zhi Wei, Hakon Hakonarson

BACKGROUND: tRNAscan-SE is the leading tool for transfer RNA (tRNA) annotation and is widely used in the field. However, tRNAscan-SE can return a significant number of false positives when applied to large sequences. Conventional machine learning methods have recently been proposed to address this issue, but their effectiveness can still be limited by their dependence on handcrafted features. With the growing availability of large-scale genomic datasets, deep learning methods, especially convolutional neural networks, have proven effective at characterizing patterns in genomic sequences. We therefore hypothesize that deep learning may further improve tRNA prediction.

METHODS: We propose a new computational approach based on deep neural networks to predict tRNA gene sequences. We designed and investigated various deep neural network architectures. We used tRNA sequences as positive samples, and the false-positive tRNA sequences predicted by tRNAscan-SE in coding sequences as negative samples. With these data we trained the proposed models and evaluated them against conventional machine learning methods and popular tRNA prediction tools.

RESULTS: Using one-hot encoding of the input sequences, our proposed models can extract features without extensive manual feature engineering. Our best model outperformed the existing methods under different performance metrics.

CONCLUSION: The proposed deep learning methods can substantially reduce the false positives output by the state-of-the-art tool tRNAscan-SE. Coupled with tRNAscan-SE, they can serve as a useful complementary tool for tRNA annotation. The application to tRNA prediction demonstrates the advantage of deep learning in automatic feature generation for characterizing sequence patterns.
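As an illustration of the one-hot encoding mentioned in the abstract, the sketch below maps each nucleotide to a four-element indicator row. It is a minimal example and not the authors' implementation: the function name `one_hot_encode`, the `ACGT` column order, and the choice to leave ambiguous bases such as `N` as all-zero rows are assumptions made here for illustration.

```python
# Illustrative sketch only. The alphabet order and the treatment of
# ambiguous bases are assumptions, not details taken from the paper.
ALPHABET = "ACGT"


def one_hot_encode(sequence: str) -> list[list[int]]:
    """Map a DNA sequence to one 4-element indicator row per base."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = []
    for base in sequence.upper():
        row = [0] * len(ALPHABET)
        if base in index:  # unknown symbols (e.g. N) stay all-zero
            row[index[base]] = 1
        encoded.append(row)
    return encoded


if __name__ == "__main__":
    for row in one_hot_encode("GATN"):
        print(row)
```

A matrix of this shape (sequence length by four) is the kind of input a convolutional network can consume directly, which is why no handcrafted features are required at this step.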

Updated: 2019-11-01