A Novel Clustering Method Using Enhanced Grey Wolf Optimizer and MapReduce
Big Data Research (IF 3.5), Pub Date: 2018-05-21, DOI: 10.1016/j.bdr.2018.05.002
Ashish Kumar Tripathi , Kapil Sharma , Manju Bala

With advances in technology, data volumes are growing rapidly. Making intelligent decisions based on data requires effective analytic methods. Data clustering, a prominent analytic method in data mining, is widely employed in data analytics. To analyze massive data sets, traditional clustering methods need to be improved. In this paper, an efficient clustering method, the MapReduce-based enhanced grey wolf optimizer (MR-EGWO), is presented for clustering large-scale data sets. The proposed method introduces a novel variant of the grey wolf optimizer, the enhanced grey wolf optimizer (EGWO), in which the hunting strategy of the grey wolf is hybridized with binomial crossover, and Lévy flight steps are introduced to enhance the capability of searching for prey. The proposed variant is then used to optimize the clustering process. The clustering efficiency of the EGWO is tested on seven UCI benchmark datasets and compared with five existing clustering techniques, namely K-Means, particle swarm optimization (PSO), gravitational search algorithm (GSA), bat algorithm (BA), and grey wolf optimizer (GWO). The convergence behavior and consistency of the EGWO have been validated through convergence graphs and boxplots. Further, the proposed EGWO is parallelized on the MapReduce model in the Hadoop framework, and named MR-EGWO, to handle large-scale datasets. The clustering quality of the MR-EGWO is also validated in terms of F-measure and compared with four state-of-the-art MapReduce-based methods, namely parallel K-Means, parallel K-PSO, MapReduce-based artificial bee colony optimization (MR-ABC), and the dynamic frequency based parallel k-bat algorithm (DFBPKBA). Experimental results affirm that the proposed technique is a promising and powerful alternative for efficient large-scale data clustering.
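The abstract names three ingredients of EGWO (the grey wolf hunting update, binomial crossover, and Lévy flight steps) but does not say how they are combined. The following is a minimal Python sketch of one plausible combination, not the authors' implementation. The standard GWO update, differential-evolution-style binomial crossover, and Mantegna's algorithm for Lévy steps are used as commonly described; the order of the steps, the greedy acceptance rule, the sum-of-squared-distances objective, and all names and parameter values (`egwo_sketch`, `cr`, `levy_scale`, and so on) are assumptions made for illustration. The MapReduce parallelization is not shown.

```python
import math
import random

def sse(position, points, k):
    """Assumed objective: squared distance of each point to its nearest centroid."""
    dim = len(points[0])
    centroids = [position[i * dim:(i + 1) * dim] for i in range(k)]
    return sum(min(sum((p - c) ** 2 for p, c in zip(pt, cen)) for cen in centroids)
               for pt in points)

def levy_step(rng, beta=1.5):
    """One Levy-distributed step length via Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma = (num / den) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1) or 1e-12  # guard against an exact zero
    return u / abs(v) ** (1 / beta)

def gwo_candidate(x, leaders, a, rng):
    """Standard GWO move: average of the pulls toward alpha, beta and delta."""
    new = []
    for j, xj in enumerate(x):
        total = 0.0
        for leader in leaders:
            A = 2 * a * rng.random() - a
            C = 2 * rng.random()
            total += leader[j] - A * abs(C * leader[j] - xj)
        new.append(total / len(leaders))
    return new

def binomial_crossover(current, candidate, cr, rng):
    """Take each coordinate from the candidate with probability cr (at least one)."""
    j_rand = rng.randrange(len(current))
    return [c if (rng.random() < cr or j == j_rand) else x
            for j, (x, c) in enumerate(zip(current, candidate))]

def egwo_sketch(points, k, wolves=20, iters=100, cr=0.9, levy_scale=0.01, seed=0):
    """Hypothetical EGWO-style clustering loop; returns (centroids, fitness)."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Each wolf encodes k centroids, initialised from randomly chosen data points.
    pack = [[v for pt in rng.sample(points, k) for v in pt] for _ in range(wolves)]
    fit = [sse(w, points, k) for w in pack]
    for t in range(iters):
        a = 2 - 2 * t / iters  # exploration factor decays linearly from 2 to 0
        order = sorted(range(wolves), key=fit.__getitem__)
        leaders = [list(pack[i]) for i in order[:3]]  # alpha, beta, delta
        for i in range(wolves):
            cand = gwo_candidate(pack[i], leaders, a, rng)
            trial = binomial_crossover(pack[i], cand, cr, rng)
            # Assumed Levy perturbation, scaled by the distance to the alpha wolf.
            trial = [tj + levy_scale * levy_step(rng) * (tj - lj)
                     for tj, lj in zip(trial, leaders[0])]
            f = sse(trial, points, k)
            if f < fit[i]:  # assumed greedy acceptance
                pack[i], fit[i] = trial, f
    best = min(range(wolves), key=fit.__getitem__)
    centroids = [pack[best][i * dim:(i + 1) * dim] for i in range(k)]
    return centroids, fit[best]
```

Calling `egwo_sketch(points, k=2)` on a list of equal-length numeric lists returns the best centroids found and their objective value. Because a trial position is accepted only when it improves a wolf's fitness, the best fitness never gets worse across iterations in this sketch; the published method may use a different acceptance rule.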




Updated: 2018-05-21