Moth-Flame Optimization-Bat Optimization: Map-Reduce Framework for Big Data Clustering Using the Moth-Flame Bat Optimization and Sparse Fuzzy C-Means.
Big Data (IF 4.6), Pub Date: 2020-05-19, DOI: 10.1089/big.2019.0125
Vasavi Ravuri, S Vasundra

Technical advances in big data have made it popular among users for storing, processing, and handling huge data sets. However, clustering these data sets has become a major challenge in big data analysis. Conventional clustering algorithms used scalable solutions to manage huge data sets. This study therefore proposes a technique for big data clustering using the Spark architecture. The technique clusters big data in two steps, feature selection and clustering, performed in the initial cluster nodes of the Spark architecture. First, the initial cluster nodes read the big data from various distributed systems, and the optimal features are selected and placed in the feature vector by the proposed moth-flame optimization-based bat (MFO-Bat) algorithm, which integrates the MFO and Bat algorithms. Then, the selected features are fed to the final cluster nodes of Spark, which use the sparse fuzzy C-means method to perform optimal clustering. The proposed MFO-Bat outperformed other existing methods, with a maximal classification accuracy of 95.806%, a Dice coefficient of 99.181%, and a Jaccard coefficient of 98.376%.
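
To make the clustering stage described above more concrete, the following is a minimal sketch of standard fuzzy C-means in Python with NumPy. It is not the authors' sparse variant, their MFO-Bat feature selection, or their Spark implementation. The function name `fuzzy_c_means`, the fuzzifier `m`, the synthetic data, and the hard-coded `selected` feature indices (standing in for whatever a feature-selection step would return) are all assumptions made for illustration.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Plain fuzzy C-means (illustrative; not the paper's sparse variant).

    Returns (centers, U) where U[i, j] is the membership of point i
    in cluster j and each row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]

    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((n_samples, n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # Cluster centers: mean of the points weighted by membership^m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]

        # Euclidean distance from every point to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero

        # Membership update: u_ij proportional to d_ij^(-2 / (m - 1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return centers, U


# Synthetic data: two well-separated groups with three features each.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(50, 3)),
    rng.normal(5.0, 0.3, size=(50, 3)),
])

# Hypothetical result of a feature-selection step (assumed, not MFO-Bat).
selected = [0, 2]

centers, U = fuzzy_c_means(X[:, selected], n_clusters=2)
labels = U.argmax(axis=1)  # hard assignment from the soft memberships
```

Each point receives a degree of membership in every cluster rather than a single label, and `labels` simply takes the strongest membership. In a distributed setting the weighted sums used to compute `centers` could be accumulated per partition and then combined, but that part is not shown here.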

Updated: 2020-05-19