Analysis of DNA Sequence Classification Using CNN and Hybrid Models
Computational and Mathematical Methods in Medicine (IF 2.809), Pub Date: 2021-07-16, DOI: 10.1155/2021/1835056
Hemalatha Gunasekaran 1, K Ramalakshmi 2, A Rex Macedo Arokiaraj 1, S Deepa Kanmani 3, Chandran Venkatesan 4, C Suresh Gnana Dhas 5

DNA sequence classification is a crucial challenge in computational biomedical data analysis. In recent years, several machine learning techniques have been applied to this task successfully. Identifying and classifying viruses is essential to avoiding outbreaks such as COVID-19, and it also helps in assessing the effects of viruses and in drug design. Nevertheless, feature selection remains the most challenging aspect of the problem: sequences lack explicit features, and the most commonly used representations aggravate the high dimensionality. Deep learning (DL) models, by contrast, can extract features from the input automatically. In this work, we employ CNN, CNN-LSTM, and CNN-Bidirectional LSTM architectures with label and k-mer encoding for DNA sequence classification. The models are evaluated on several classification metrics. In the experiments, the CNN and the CNN-Bidirectional LSTM with k-mer encoding offer high accuracy on the test data, 93.16% and 93.13%, respectively.
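
The two encodings named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the integer mapping in `LABELS`, the choice of k = 3, the stride of 1, and the function names `label_encode`, `kmers`, and `kmer_encode` are all assumptions made for illustration, since the abstract does not specify them.

```python
from itertools import product

# Assumed nucleotide-to-integer mapping (hypothetical; not taken from the paper).
LABELS = {"A": 1, "C": 2, "G": 3, "T": 4}


def label_encode(seq, unknown=0):
    """Label encoding: map each nucleotide to an integer.

    Symbols outside A/C/G/T (for example N) map to `unknown`.
    """
    return [LABELS.get(base, unknown) for base in seq.upper()]


def kmers(seq, k=3):
    """Split a sequence into overlapping substrings of length k (stride 1)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_encode(seq, k=3):
    """k-mer encoding: replace each overlapping k-mer with an integer index.

    The vocabulary lists all 4**k possible k-mers in lexicographic order,
    numbered from 1; k-mers containing other symbols map to 0.
    """
    vocab = {"".join(p): i + 1 for i, p in enumerate(product("ACGT", repeat=k))}
    return [vocab.get(kmer, 0) for kmer in kmers(seq, k)]


if __name__ == "__main__":
    print(label_encode("ATGCA"))    # [1, 4, 3, 2, 1]
    print(kmers("ATGCA", k=3))      # ['ATG', 'TGC', 'GCA']
    print(kmer_encode("ATGCA", 3))  # [15, 58, 37]
```

In a setup like this, label encoding yields one integer per nucleotide, whereas k-mer encoding turns the sequence into a series of short "words" whose indices could be passed to an embedding layer ahead of the convolutional layers.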

Updated: 2021-07-16