
MapReduce FCM clustering set algorithm

Cluster Computing

Abstract

The fuzzy C-means (FCM) clustering integration algorithm improves clustering quality by combining the results of several cluster members, but its running time grows as the amount of data increases. To address this, a parallel FCM clustering integration algorithm based on MapReduce is proposed. The algorithm uses random initial cluster centres to obtain diverse cluster members. It then builds an overlap matrix between clusters to unify the cluster labels and identify logically equivalent clusters. Finally, the cluster members share their classification of each data object by voting to produce the final clustering result. Experimental results show that the parallel FCM clustering integration algorithm performs well, with high speedup and good scalability.
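The label-unification and voting steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each cluster member has already been reduced to hard labels (for example, by taking the highest FCM membership per object), it treats the first member as the reference labelling, and it matches clusters greedily by largest overlap. The function names `overlap_matrix`, `align_labels`, and `vote` are hypothetical, and the MapReduce parallelisation is omitted.

```python
from collections import Counter


def overlap_matrix(ref, labels, k):
    """m[i][j] = number of objects labelled i in `labels` and j in `ref`."""
    m = [[0] * k for _ in range(k)]
    for a, b in zip(labels, ref):
        m[a][b] += 1
    return m


def align_labels(ref, labels, k):
    """Rename the clusters in `labels` to the reference clusters they overlap most.

    Assumption: a greedy one-to-one matching on the overlap counts is used
    here; the source does not specify the matching rule.
    """
    m = overlap_matrix(ref, labels, k)
    # Visit (overlap, cluster, reference cluster) triples from largest overlap down.
    pairs = sorted(
        ((m[i][j], i, j) for i in range(k) for j in range(k)), reverse=True
    )
    mapping, used = {}, set()
    for _, i, j in pairs:
        if i not in mapping and j not in used:
            mapping[i] = j
            used.add(j)
    return [mapping[a] for a in labels]


def vote(members, k):
    """Align every member to the first one, then take a per-object majority vote."""
    ref = members[0]
    aligned = [ref] + [align_labels(ref, m, k) for m in members[1:]]
    return [Counter(column).most_common(1)[0][0] for column in zip(*aligned)]


# Three hypothetical cluster members over nine objects and k = 3 clusters.
members = [
    [0, 0, 0, 1, 1, 1, 2, 2, 2],  # reference labelling
    [2, 2, 2, 0, 0, 0, 1, 1, 1],  # same partition, permuted label names
    [1, 1, 0, 0, 0, 0, 2, 2, 2],  # disagrees on the third object
]
final_labels = vote(members, 3)
```

In this toy input the second member is logically equivalent to the first once its labels are renamed, and the third member's disagreement on one object is outvoted. A one-to-one matching keeps two clusters of a member from collapsing onto the same reference cluster; an optimal assignment method could replace the greedy step.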

Author information

Corresponding author

Correspondence to Bonzou Adolphe Kouassi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Mbyamm Kiki, M.J., Zhang, J. & Kouassi, B.A. MapReduce FCM clustering set algorithm. Cluster Comput 24, 489–500 (2021). https://doi.org/10.1007/s10586-020-03131-0

  • DOI: https://doi.org/10.1007/s10586-020-03131-0
