Optimized hybrid investigative based dimensionality reduction methods for malaria vector using KNN classifier
Journal of Big Data (IF 8.6), Pub Date: 2021-02-04, DOI: 10.1186/s40537-021-00415-z
Micheal Olaolu Arowolo, Marion Olubunmi Adebiyi, Ayodele Ariyo Adebiyi, Oludayo Olugbara

RNA-Seq data are used in biological applications and to support decisions in gene classification. Much recent work has focused on reducing the dimensionality of RNA-Seq data, and several dimensionality reduction approaches have been proposed for transforming these data. This study proposes a novel optimized hybrid investigative approach that combines an optimized genetic algorithm with Principal Component Analysis and Independent Component Analysis (GA-O-PCA and GAO-ICA), which are used to identify an optimum subset and latent correlated features, respectively. A KNN classifier is applied to the reduced Anopheles gambiae mosquito dataset to improve the accuracy and scalability of gene expression analysis. The proposed algorithm extracts relevant features from the high-dimensional input feature space, and a fast feature-ranking algorithm is used to select them. The model is evaluated and validated by comparing its classification accuracy with that of existing approaches in the literature. The experimental results are promising for selecting relevant genes and classifying gene expression data, indicating that the approach can complement prevailing machine learning methods.
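
The general idea of pairing a genetic algorithm with a KNN classifier can be illustrated with a small sketch. This is not the authors' implementation: it covers only a wrapper-style feature selection step scored by leave-one-out KNN accuracy, omits the PCA and ICA projections entirely, and runs on synthetic toy data rather than the Anopheles gambiae dataset. All function names, parameter values, and the per-feature penalty of 0.01 are assumptions made for illustration.

```python
import random


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def loo_accuracy(X, y, mask, k=3):
    """Leave-one-out KNN accuracy using only the features switched on in mask."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    Xs = [[row[j] for j in cols] for row in X]
    hits = 0
    for i in range(len(Xs)):
        rest_X = Xs[:i] + Xs[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, Xs[i], k) == y[i]
    return hits / len(Xs)


def ga_select(X, y, generations=15, pop_size=12, mutation_rate=0.1, seed=0):
    """Evolve a 0/1 feature mask; fitness is KNN accuracy minus a size penalty."""
    rng = random.Random(seed)
    n = len(X[0])

    def fitness(mask):
        # Hypothetical penalty of 0.01 per feature, so compact subsets win ties.
        return loo_accuracy(X, y, mask) - 0.01 * sum(mask)

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection; the best masks survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


def make_toy_data(n_samples=40, n_features=8, seed=1):
    """Synthetic two-class data: features 0 and 1 carry signal, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 4 * label
        row[1] -= 4 * label
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    best = ga_select(X, y)
    print("selected mask:", best)
    print("leave-one-out accuracy:", loo_accuracy(X, y, best))
```

In a fuller pipeline the selected columns would then be passed through a projection such as PCA or ICA before the final KNN classification; that step is left out here to keep the sketch dependency-free.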




Updated: 2021-02-04