Abstract
The Rand index (RI) and the adjusted Rand index (ARI) are commonly used to measure the agreement between clustering partitions. Such external validation indexes quantify how close the clusters are to a reference partition (or to prior knowledge about the data) by counting how pairs of elements are classified. To evaluate the solution of a fuzzy clustering algorithm, several extensions of the Rand index and other similarity measures to fuzzy partitions have been proposed. This paper proposes an extension of the ARI to fuzzy partitions based on the normalized degree of concordance. The performance of the proposed index is evaluated through Monte Carlo simulation studies.
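To make the pair-counting idea concrete, the following is a minimal sketch of the classical RI and the Hubert-Arabie ARI for crisp (non-fuzzy) partitions only. It does not implement the fuzzy index proposed in the paper, and the function names (`pair_counts`, `rand_index`, `adjusted_rand_index`) and the toy labels are illustrative assumptions rather than anything taken from the article.

```python
from collections import Counter
from math import comb


def pair_counts(labels_a, labels_b):
    """Count element pairs by how two crisp partitions classify them.

    Returns (a, b, c, d):
      a = pairs placed together in both partitions
      b = pairs together in A but separated in B
      c = pairs separated in A but together in B
      d = pairs separated in both partitions
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("partitions must label the same elements")
    n = len(labels_a)
    if n < 2:
        raise ValueError("at least two elements are needed to form a pair")
    total = comb(n, 2)
    # Pairs together in both = pairs inside each cell of the contingency table.
    same_both = sum(comb(v, 2) for v in Counter(zip(labels_a, labels_b)).values())
    same_a = sum(comb(v, 2) for v in Counter(labels_a).values())
    same_b = sum(comb(v, 2) for v in Counter(labels_b).values())
    a = same_both
    b = same_a - same_both
    c = same_b - same_both
    d = total - a - b - c
    return a, b, c, d


def rand_index(labels_a, labels_b):
    """Fraction of pairs on which the two partitions agree."""
    a, b, c, d = pair_counts(labels_a, labels_b)
    return (a + d) / (a + b + c + d)


def adjusted_rand_index(labels_a, labels_b):
    """Rand index corrected for chance agreement (Hubert-Arabie form)."""
    a, b, c, d = pair_counts(labels_a, labels_b)
    total = a + b + c + d
    expected = (a + b) * (a + c) / total      # expected value of a under chance
    max_index = ((a + b) + (a + c)) / 2       # upper bound used for normalization
    if max_index == expected:
        # Degenerate case (e.g. both partitions are a single cluster):
        # treated here as perfect agreement, which is a convention, not a theorem.
        return 1.0
    return (a - expected) / (max_index - expected)


# Hypothetical toy example: a reference partition and a clustering result.
reference = [0, 0, 0, 1, 1, 1]
found = [0, 0, 1, 1, 2, 2]
ri = rand_index(reference, found)
ari = adjusted_rand_index(reference, found)
```

For these toy labels the pair counts are (2, 4, 1, 8), so the RI is 10/15 while the ARI is considerably lower, which illustrates why the chance correction matters when comparing partitions.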
Acknowledgments
The authors would like to thank both the Editor and an anonymous reviewer, whose comments and remarks contributed greatly to improving the quality of this manuscript.
Cite this article
D’Ambrosio, A., Amodio, S., Iorio, C. et al. Adjusted Concordance Index: an Extension of the Adjusted Rand Index to Fuzzy Partitions. J Classif 38, 112–128 (2021). https://doi.org/10.1007/s00357-020-09367-0