Data Mining Algorithm for Cloud Network Information Based on Artificial Intelligence Decision Mechanism
IEEE Access (IF 3.9), Pub Date: 2020-01-01, DOI: 10.1109/access.2020.2981632
Yuan Huang, Zhe Cheng, Qianyu Zhou, Yuxing Xiang, Ruixiao Zhao

Rapid advances in information and network technology have produced vast amounts of data, yet the problem of being rich in data but poor in knowledge is becoming increasingly serious. Data mining technology has developed vigorously in this environment. Based on the Spark programming model, this paper designs a parallel extension of fuzzy c-means. To improve the performance of this parallel extension, the paper borrows the initialization-phase improvement strategy developed for k-means and extends k-means// to fuzzy c-means to obtain better clustering performance. Combining this with Spark’s programming model yields an extended parallel fuzzy c-means algorithm. Several experiments on data sets show that the proposed algorithm has good scalability and parallelism: it effectively extends fuzzy c-means clustering to distributed applications and greatly increases the scale of data the algorithm can process. This improves the robustness of the algorithm and its adaptability to the shape and structure of the data, so that the parallel, scalable clustering algorithm can perform cluster analysis on big data more effectively. Three algorithms were simulated on the MATLAB platform. Using simple data sets and complex two-dimensional data sets, the proposed algorithm is compared with the traditional fuzzy c-means algorithm and a fuzzy c-means algorithm based on fuzzy entropy. The experiments show that the scalable parallel fuzzy c-means algorithm not only greatly improves anti-noise performance but also improves convergence speed, and it can automatically determine the optimal number of clusters.
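
For readers unfamiliar with fuzzy c-means, the following is a minimal single-machine sketch of the standard algorithm in plain Python. It is not the authors' Spark implementation. The function names, the fuzzifier default `m=2.0`, the stopping rule, and the `init_centers` parameter are all assumptions made for illustration. The `init_centers` hook marks where a k-means//-style seeding would plug in; the seeding itself is not implemented here, and the fallback is plain random sampling.

```python
import math
import random


def update_memberships(points, centers, m=2.0):
    """Return u[i][j]: the degree to which points[i] belongs to centers[j]."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point sits exactly on one or more centers:
            # share full membership among those centers only.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        inverse = [d ** -exponent for d in dists]
        total = sum(inverse)
        memberships.append([v / total for v in inverse])
    return memberships


def update_centers(points, memberships, m=2.0):
    """Return each center as the mean of all points weighted by u**m."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** m for row in memberships]
        total = sum(weights)  # assumed non-zero in this sketch
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers


def fuzzy_c_means(points, n_clusters, m=2.0, max_iter=100, tol=1e-6,
                  init_centers=None, seed=0):
    """Alternate membership and center updates until the centers stop moving."""
    if init_centers is None:
        # Assumption: plain random seeding stands in for a smarter initializer.
        init_centers = random.Random(seed).sample(points, n_clusters)
    centers = [tuple(c) for c in init_centers]
    for _ in range(max_iter):
        memberships = update_memberships(points, centers, m)
        new_centers = update_centers(points, memberships, m)
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, update_memberships(points, centers, m)


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    found, u = fuzzy_c_means(data, 2, init_centers=[(0.0, 0.0), (10.0, 10.0)])
    print(found)
```

In this sketch each point's membership row depends only on that point and the current centers, and each center is a ratio of two sums over points. That structure is what makes a map-and-reduce style parallelization plausible, although how the paper actually partitions the work on Spark is not stated in the abstract.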

Updated: 2020-01-01