Abstract
In this article, we analyse the usefulness of multidimensional scaling for performing K-means clustering on a dissimilarity matrix when the dimensionality of the objects is unknown. In this situation, traditional algorithms cannot be used, so K-means clustering procedures are performed directly on the observed dissimilarity matrix. Furthermore, criteria for determining the number of clusters that were originally formulated for two-mode data sets can be applied only if they can be reformulated for the one-mode situation. We investigate the linear invariance property of K-means clustering for squared dissimilarities, together with the use of multidimensional scaling, to determine the cluster membership of the observations and to address the problem of selecting the number of clusters in K-means for a dissimilarity matrix. In particular, we analyse the performance of K-means clustering on the full-dimensional scaling configuration and on the equivalently partitioned configuration related to a suitable translation of the squared dissimilarities. A Monte Carlo experiment is conducted in which the methodology examined is compared with the results obtained by procedures directly applicable to a dissimilarity matrix.
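As an illustration of the ideas summarised in the abstract, the following minimal Python sketch (not the authors' implementation) uses three standard facts: the K-means criterion can be evaluated from squared dissimilarities alone, adding a constant c to every off-diagonal squared dissimilarity changes that criterion by c(n - K)/2 for any partition into K clusters, and classical scaling of the translated matrix yields a full-dimensional configuration that reproduces it. The function names `wcss_from_sq_dissim` and `classical_mds`, the simulated data, and the value of `c` are assumptions chosen only for this example.

```python
import numpy as np

def wcss_from_sq_dissim(D2, labels):
    """K-means criterion computed only from squared dissimilarities."""
    total = 0.0
    for k in np.unique(labels):
        idx = np.flatnonzero(labels == k)
        # Sum of squared dissimilarities within the cluster, over 2 * n_k.
        total += D2[np.ix_(idx, idx)].sum() / (2.0 * idx.size)
    return total

def classical_mds(D2, tol=1e-9):
    """Classical scaling keeping every dimension with a positive eigenvalue."""
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ D2 @ J                    # double-centered matrix
    evals, evecs = np.linalg.eigh(B)
    keep = evals > tol * evals.max()
    return evecs[:, keep] * np.sqrt(evals[keep])

# Simulated example: 3 groups of 10 points in the plane (illustrative only).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
labels = np.repeat(np.arange(3), 10)
X = centers[labels] + rng.normal(size=(30, 2))
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)

# The coordinate-based and dissimilarity-based criteria coincide.
wcss_coords = sum(
    ((X[labels == k] - X[labels == k].mean(axis=0)) ** 2).sum() for k in range(3)
)
wcss_dissim = wcss_from_sq_dissim(D2, labels)

# Translate the off-diagonal squared dissimilarities by a constant c.
n, K, c = 30, 3, 7.5
D2_shifted = D2 + c * (1.0 - np.eye(n))

# The change in the criterion is the same for any partition into K clusters.
other = rng.permutation(labels)
shift_true = wcss_from_sq_dissim(D2_shifted, labels) - wcss_dissim
shift_other = (
    wcss_from_sq_dissim(D2_shifted, other) - wcss_from_sq_dissim(D2, other)
)

# Full-dimensional configuration of the translated matrix.
Y = classical_mds(D2_shifted)
D2_rec = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
```

Because the shift in the criterion does not depend on the partition when K is fixed, the partition that minimises the criterion is the same before and after the translation. Any standard K-means routine could then be run on the coordinates `Y`.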
Acknowledgements
This work has been partially supported by Grants ECO2013-48413-R of the Spanish Ministry of Economy and Competitiveness, co-financed by FEDER, and RTI2018-099723-B-I00, Ministry of Science and Innovation—State Research Agency of Spain, co-financed by FEDER (J. Fernando Vera) and CB-252996, CONACYT, México (Rodrigo Macías).
Cite this article
Vera, J.F., Macías, R. On the Behaviour of K-Means Clustering of a Dissimilarity Matrix by Means of Full Multidimensional Scaling. Psychometrika 86, 489–513 (2021). https://doi.org/10.1007/s11336-021-09757-2
DOI: https://doi.org/10.1007/s11336-021-09757-2