WOA + BRNN: An imbalanced big data classification framework using Whale optimization and deep neural network
Soft Computing ( IF 3.1 ) Pub Date : 2019-03-11 , DOI: 10.1007/s00500-019-03901-y
Eslam. M. Hassib , Ali. I. El-Desouky , Labib. M. Labib , El-Sayed M. El-kenawy

Abstract

Nowadays, big data plays a substantial part in information and knowledge analysis, manipulation, and forecasting. Analyzing and extracting knowledge from such large datasets is a very challenging task due to imbalanced data distributions, which can lead to biased classification results and wrong decisions. Standard classifiers are not capable of handling such datasets, so a new technique for dealing with them is required. This paper proposes a novel classification framework for big data that consists of three phases. The first is a feature selection phase, which uses the Whale Optimization Algorithm (WOA) to find the best set of features. The second is a preprocessing phase, which applies the SMOTE and LSH-SMOTE algorithms to solve the class imbalance problem. The third is the WOA + BRNN algorithm, which uses the Whale Optimization Algorithm, for the first time, to train a deep learning model called a bidirectional recurrent neural network. Our proposed WOA + BRNN algorithm was tested on nine highly imbalanced datasets (one of them a big dataset), in terms of area under the curve (AUC), against four of the most commonly used machine learning algorithms (Naïve Bayes, AdaBoostM1, decision table, and random tree), as well as against GWO-MLP (a multilayer perceptron trained with the Grey Wolf Optimizer). We then tested our algorithm on four well-known datasets, in terms of classification accuracy, against GWO-MLP, particle swarm optimization (PSO-MLP), genetic algorithm (GA-MLP), ant colony optimization (ACO-MLP), evolution strategy (ES-MLP), and population-based incremental learning (PBIL-MLP). Experimental results show that the proposed WOA + BRNN algorithm achieves promising accuracy and strong local optima avoidance, outperforming the four common machine learning algorithms and GWO-MLP in terms of AUC.
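As a rough illustration of the preprocessing phase (not the authors' code), the core idea of SMOTE is to synthesize new minority-class samples by interpolating between an existing minority sample and one of its k nearest minority neighbours. The sketch below is a simplified, self-contained version; the function name, parameters, and toy data are illustrative assumptions, and a production pipeline would use a tested implementation such as `imblearn.over_sampling.SMOTE`.

```python
import random

def smote(minority, n_synthetic, k=2, seed=0):
    """Simplified SMOTE sketch: for each synthetic point, pick a random
    minority sample, find its k nearest minority neighbours (Euclidean),
    and interpolate toward a randomly chosen neighbour."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        x = rng.choice(minority)
        # k nearest neighbours of x within the minority class (excluding x)
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

# Toy minority class (illustrative data only)
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)]
new_points = smote(minority, n_synthetic=4)
print(len(new_points))  # 4 synthetic samples, each between two real ones
```

Because each synthetic point lies on the segment between two real minority samples, oversampling enlarges the minority class without simply duplicating records, which is what mitigates the bias the abstract describes. (LSH-SMOTE additionally uses locality-sensitive hashing to speed up the neighbour search on big data.)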
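For context on the optimizer used in phases one and three, a single WOA position update can be sketched as follows. This is a one-dimensional toy version of the standard WOA equations (encircling/search move X' = X* - A·|C·X* - X| versus the spiral move X' = |X* - X|·e^(bl)·cos(2πl) + X*), not the paper's implementation; the coefficient `a` is assumed to decay linearly from 2 to 0 over iterations, as in the original WOA.

```python
import math
import random

def woa_step(x, best, a, b=1.0, rng=random):
    """One WOA position update for a scalar dimension (sketch).
    x    : current whale position
    best : best position found so far (the 'prey')
    a    : exploration coefficient, decreased from 2 to 0 over iterations
    b    : spiral shape constant"""
    if rng.random() < 0.5:
        A = 2 * a * rng.random() - a      # A = 2a*r - a
        C = 2 * rng.random()              # C = 2r
        d = abs(C * best - x)             # distance to the best whale
        return best - A * d               # encircling / exploring move
    l = rng.uniform(-1, 1)
    d = abs(best - x)
    return d * math.exp(b * l) * math.cos(2 * math.pi * l) + best  # spiral

rng = random.Random(42)
new_pos = woa_step(x=0.3, best=0.9, a=1.0, rng=rng)
print(new_pos)
```

In the paper's third phase, positions like `x` would encode BRNN weights, and the fitness guiding `best` would be the network's classification error; the stochastic switch between encircling and spiral moves is what gives WOA its local optima avoidance.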




Updated: 2020-03-24