Online Feature Selection (OFS) with Accelerated Bat Algorithm (ABA) and Ensemble Incremental Deep Multiple Layer Perceptron (EIDMLP) for big data streams
Journal of Big Data (IF 8.6), Pub Date: 2019-11-21, DOI: 10.1186/s40537-019-0267-3
D. Renuka Devi, S. Sasikala

Feature selection is mainly used to reduce the processing load of data mining models. To shorten the time needed to process voluminous data, the work is parallelized with the MapReduce (MR) technique. However, with existing algorithms, classifier performance still needs substantial improvement. The MR method recommended in this research work performs feature selection in parallel, which improves performance. To enhance the efficacy of the classifier, this research work proposes an Online Feature Selection (OFS)–Accelerated Bat Algorithm (ABA) method and a framework for applications that stream features without full prior knowledge of the feature space. The OFS-ABA method selects significant, non-redundant features within the MapReduce (MR) framework. Finally, an Ensemble Incremental Deep Multiple Layer Perceptron (EIDMLP) classifier is applied to classify the dataset samples; it combines the outputs of homogeneous IDMLP classifiers. The proposed feature selection method and the classifier are evaluated extensively on three high-dimensional datasets. The MR-OFS-ABA method performs better than the existing feature selection methods PSO, APSO and ASAMO (Accelerated Simulated Annealing and Mutation Operator). The EIDMLP classifier is compared with existing classifiers such as Naïve Bayes (NB), Hoeffding tree (HT), and Fuzzy Minimal Consistent Class Subset Coverage (FMCCSC)-KNN (K Nearest Neighbour). In total, the methodology is applied to three datasets, and results are compared across four classifiers and three state-of-the-art feature selection algorithms. The results show improved accuracy and reduced processing time.
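
To make the idea of bat-style feature selection more concrete, the following is a minimal sketch of a simplified binary bat search over feature masks. It is not the authors' accelerated variant (ABA), it does not use MapReduce, and it does not handle streaming features. The function names (`fitness`, `binary_bat_select`), the toy relevance scores, the size penalty, the sigmoid binarization, and all parameter values are assumptions made purely for illustration; a real setup would presumably score a mask by classifier accuracy on the selected features.

```python
import math
import random


def fitness(mask, relevance, penalty=0.1):
    """Toy objective (assumed): relevance of selected features minus a size penalty."""
    return sum(r for bit, r in zip(mask, relevance) if bit) - penalty * sum(mask)


def binary_bat_select(relevance, n_bats=10, n_iter=50, f_min=0.0, f_max=2.0,
                      alpha=0.9, gamma=0.9, r0=0.5, seed=0):
    """Simplified binary bat search; returns (best_mask, best_fitness, history)."""
    rng = random.Random(seed)
    dim = len(relevance)
    pos = [[rng.randint(0, 1) for _ in range(dim)] for _ in range(n_bats)]
    vel = [[0.0] * dim for _ in range(n_bats)]
    loud = [1.0] * n_bats    # loudness A_i, decays when a bat improves
    pulse = [0.0] * n_bats   # pulse rate r_i, grows when a bat improves
    fit = [fitness(p, relevance) for p in pos]
    best_idx = max(range(n_bats), key=fit.__getitem__)
    best, best_fit = pos[best_idx][:], fit[best_idx]
    history = [best_fit]

    for t in range(1, n_iter + 1):
        for i in range(n_bats):
            freq = f_min + (f_max - f_min) * rng.random()
            cand = []
            for d in range(dim):
                # Velocity pulls toward the best mask (sign chosen for this
                # sketch); clamp it so the sigmoid below cannot overflow.
                vel[i][d] += (best[d] - pos[i][d]) * freq
                vel[i][d] = max(-6.0, min(6.0, vel[i][d]))
                prob_one = 1.0 / (1.0 + math.exp(-vel[i][d]))
                cand.append(1 if rng.random() < prob_one else 0)
            if rng.random() > pulse[i]:
                # Local search: copy the best mask and flip one random bit.
                cand = best[:]
                j = rng.randrange(dim)
                cand[j] = 1 - cand[j]
            cand_fit = fitness(cand, relevance)
            if rng.random() < loud[i] and cand_fit > fit[i]:
                pos[i], fit[i] = cand, cand_fit
                loud[i] *= alpha
                pulse[i] = r0 * (1.0 - math.exp(-gamma * t))
            if cand_fit > best_fit:
                best, best_fit = cand[:], cand_fit
        history.append(best_fit)
    return best, best_fit, history


if __name__ == "__main__":
    scores = [0.9, 0.05, 0.8, 0.02, 0.7, 0.01]  # hypothetical relevance scores
    mask, score, _ = binary_bat_select(scores)
    print("selected feature indices:", [i for i, b in enumerate(mask) if b])
    print("fitness:", round(score, 3))
```

In a MapReduce setting, one plausible arrangement (again an assumption, not taken from the paper) would be to evaluate candidate masks in parallel in the map phase and pick the best mask in the reduce phase.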

Updated: 2019-11-21