Relative density-based clustering algorithm for identifying diverse density clusters effectively

  • Original Article
Neural Computing and Applications

Abstract

Clustering is an important part of data mining. Many existing clustering algorithms fail on data sets with uneven density distributions. In this paper, we propose a novel relative density-based clustering algorithm for identifying diverse density clusters effectively, called IDDC. It can effectively identify clusters of different densities within a data set and can also handle outliers. We first compute the relative density of each data point. Then, the density peak points are screened, and the initial clusters are obtained from these peak points. The remaining points are assigned by searching for unallocated points from the perspective of each cluster, which allows clusters of different densities to be identified effectively. In experiments, we compare the proposed IDDC algorithm with several existing algorithms on synthetic and real-world data sets. The results show that IDDC performs better than those existing algorithms, especially on data sets with uneven density distributions.
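The three steps named in the abstract (relative density, peak screening, cluster-side assignment) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption rather than the authors' method: relative density is taken to be a point's k-nearest-neighbor density divided by the mean density of its neighbors, and the names `iddc_sketch`, `peak_threshold`, and `reach_factor` are hypothetical. The sketch also assumes the data set has more than `k` points.

```python
import math


def k_nearest(points, k):
    """For each point, return its k nearest neighbors as (distance, index) pairs."""
    neighbors = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append(dists[:k])
    return neighbors


def relative_densities(neighbors):
    """Assumed definition: local kNN density over the mean density of the neighbors."""
    # Local density: inverse of the mean distance to the k nearest neighbors.
    local = [len(nb) / (sum(d for d, _ in nb) + 1e-12) for nb in neighbors]
    return [
        local[i] / (sum(local[j] for _, j in nb) / len(nb))
        for i, nb in enumerate(neighbors)
    ]


def iddc_sketch(points, k=3, peak_threshold=0.9, reach_factor=2.0):
    """Illustrative three-step clustering; returns one label per point, -1 for outliers."""
    neighbors = k_nearest(points, k)
    rel = relative_densities(neighbors)
    labels = [-1] * len(points)

    # Step 2 (assumed rule): peaks are points whose relative density reaches the
    # threshold; peaks that are k-nearest neighbors of each other share a cluster.
    peaks = {i for i, r in enumerate(rel) if r >= peak_threshold}
    adjacency = {i: set() for i in peaks}
    for i in peaks:
        for _, j in neighbors[i]:
            if j in peaks:
                adjacency[i].add(j)
                adjacency[j].add(i)

    clusters = []
    for seed in sorted(peaks):
        if labels[seed] != -1:
            continue
        label = len(clusters)
        labels[seed] = label
        stack, members = [seed], []
        while stack:
            i = stack.pop()
            members.append(i)
            for j in adjacency[i]:
                if labels[j] == -1:
                    labels[j] = label
                    stack.append(j)
        clusters.append(members)

    # Step 3 (assumed rule): each cluster searches for unassigned points within a
    # reach derived from its own neighbor distances, so sparse clusters reach
    # farther than dense ones. The first cluster to reach a point keeps it.
    scales = []
    for members in clusters:
        dists = [d for i in members for d, _ in neighbors[i]]
        scales.append(sum(dists) / len(dists))

    changed = True
    while changed:
        changed = False
        for label, members in enumerate(clusters):
            reach = reach_factor * scales[label]
            for u in range(len(points)):
                if labels[u] == -1 and any(
                    math.dist(points[u], points[m]) <= reach for m in members
                ):
                    labels[u] = label
                    members.append(u)
                    changed = True
    return labels


# A dense group with one border point, a sparse group, and one isolated point.
points = [
    (0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.25, 0.05),
    (10, 10), (11, 10), (10, 11), (11, 11),
    (5, 30),
]
labels = iddc_sketch(points)
```

On this toy input the dense group, including its border point, and the sparse group receive separate labels, while the isolated point keeps the label -1. Because each cluster's reach is scaled by its own neighbor distances, a single global distance threshold is not needed, which is the property the abstract attributes to assigning points from the cluster's perspective.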

Acknowledgements

This work was supported by National Natural Science Foundation of China grant 61573266.

Author information

Corresponding author

Correspondence to Yuying Wang.

About this article

Cite this article

Wang, Y., Yang, Y. Relative density-based clustering algorithm for identifying diverse density clusters effectively. Neural Comput & Applic 33, 10141–10157 (2021). https://doi.org/10.1007/s00521-021-05777-2

  • DOI: https://doi.org/10.1007/s00521-021-05777-2
