Optimal Feature Selection for Big Data Classification: Firefly with Lion-Assisted Model.
Big Data (IF 4.6), Pub Date: 2020-04-17, DOI: 10.1089/big.2019.0022
Ramar Senthamil Selvi, Muniyappan Lakshapalam Valarmathi

This article develops a big data classification model built on intelligent techniques. A Parallel Pool MapReduce framework handles the big data. The model has three main phases: (1) feature extraction, (2) optimal feature selection, and (3) classification. Feature extraction uses well-known techniques: principal component analysis, linear discriminant analysis, and linear square regression. Because the resulting feature vector tends to be long, choosing the optimal features is a complex task. The proposed model therefore uses an optimal feature selection technique, the Lion-based Firefly (L-FF) algorithm, to select them. The main objective is to minimize the correlation between the selected features, so that they provide diverse information about the different classes of data. Once the optimal features are selected, a neural network (NN) classifier classifies the data using those features. The proposed L-FF+NN model is then compared with traditional methods and proves more effective. Experimental analysis shows that the proposed L-FF+NN model is 92%, 28%, 87%, 82%, and 78% superior to state-of-the-art models such as GA+NN, FF+NN, PSO+NN, ABC+NN, and LA+NN, respectively.
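
The abstract states the selection objective only in words: minimize the correlation between the selected features. One plausible way to score a candidate feature subset under that objective is the mean absolute pairwise Pearson correlation of the selected columns. The sketch below is an illustration under that assumption, not the authors' fitness function; the helper name `mean_abs_correlation`, the boolean-mask encoding of a subset, and the synthetic data are all hypothetical.

```python
import numpy as np


def mean_abs_correlation(X, mask):
    """Score a feature subset by its internal redundancy (lower is better).

    X    : (n_samples, n_features) array with no constant columns.
    mask : boolean sequence of length n_features marking the selected columns.

    Hypothetical helper: the abstract does not give the exact objective,
    so mean absolute pairwise Pearson correlation is an assumed stand-in.
    """
    selected = X[:, np.asarray(mask, dtype=bool)]
    k = selected.shape[1]
    if k < 2:
        return 0.0  # fewer than two features: no pairs to correlate
    corr = np.corrcoef(selected, rowvar=False)   # (k, k) correlation matrix
    upper = corr[np.triu_indices(k, k=1)]        # each pair counted once
    return float(np.mean(np.abs(upper)))


# Synthetic data (assumed for illustration): column 1 nearly duplicates
# column 0, while column 2 is generated independently.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, 2 * a + 0.01 * rng.normal(size=200), b])

redundant = [True, True, False]   # picks the two near-duplicate columns
diverse = [True, False, True]     # picks two independent columns

print(mean_abs_correlation(X, redundant))
print(mean_abs_correlation(X, diverse))
```

A search procedure such as the L-FF algorithm named in the abstract would evaluate many candidate masks with a score of this kind and keep the subset with the lowest value; here the `diverse` mask scores well below the `redundant` one.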

Updated: 2020-04-17