Relative density-based clustering algorithm for identifying diverse density clusters effectively

Wang, Yuying; Yang, Youlong

doi:10.1007/s00521-021-05777-2

Original Article
Published: 13 March 2021

Volume 33, pages 10141–10157, (2021)

Neural Computing and Applications

Abstract

Clustering is an important part of data mining. Existing clustering algorithms fail on data sets with uneven density distributions. In this paper, we propose a novel relative density-based clustering algorithm for identifying diverse density clusters effectively, called IDDC. It can effectively identify clusters in data sets with different densities and can also handle outliers. We first compute the relative density of each data point. The density peak points are then screened, and the initial clusters are obtained from these peak points. The remaining points are assigned by searching for unallocated points from the perspective of each cluster, which allows clusters of different densities to be identified effectively. In experiments, we compare IDDC with several existing algorithms on synthetic and real-world data sets. The results show that IDDC performs better than these algorithms, especially on data sets with uneven density distributions.
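
The abstract does not give the formula for relative density, so the following is only a minimal sketch of the general idea under stated assumptions, not the IDDC definition. It assumes that the local density of a point is the inverse of its mean distance to its k nearest neighbours, and that the relative density is this value divided by the mean local density of those neighbours. The function name `knn_relative_density`, the parameter `k`, and the small constant `1e-12` are all hypothetical choices made for illustration.

```python
import math


def knn_relative_density(points, k):
    """Illustrative kNN relative density (an assumption, not the paper's formula)."""
    n = len(points)
    neighbors, density = [], []
    for i in range(n):
        # The k nearest other points, as (distance, index) pairs in ascending order.
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        neighbors.append([j for _, j in nearest])
        # Assumed local density: inverse of the mean distance to those neighbours.
        mean_dist = sum(d for d, _ in nearest) / len(nearest)
        density.append(1.0 / (mean_dist + 1e-12))  # constant guards against duplicates
    # Assumed relative density: own density over the mean density of the neighbours.
    relative = [
        density[i] / (sum(density[j] for j in neighbors[i]) / len(neighbors[i]))
        for i in range(n)
    ]
    return relative, neighbors


# Five evenly spaced points and one isolated point far away.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (100, 0)]
relative, neighbors = knn_relative_density(points, k=2)
```

In this toy input the isolated point receives the smallest relative density, which is the kind of signal an outlier-handling step could use. Because the value is a ratio of densities, rescaling all coordinates by a common factor leaves it essentially unchanged, which is why a relative measure is attractive when clusters differ in absolute density.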

Acknowledgements

This work was supported by National Natural Science Foundation of China grant 61573266.

Author information

Authors and Affiliations

School of Mathematics and Statistics, Xidian University, Xi’an, 710071, China
Yuying Wang & Youlong Yang

Corresponding author

Correspondence to Yuying Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, Y., Yang, Y. Relative density-based clustering algorithm for identifying diverse density clusters effectively. Neural Comput & Applic 33, 10141–10157 (2021). https://doi.org/10.1007/s00521-021-05777-2

Received: 11 August 2020
Accepted: 28 January 2021
Published: 13 March 2021
Issue Date: August 2021
DOI: https://doi.org/10.1007/s00521-021-05777-2
