Abstract
Principal component analysis (PCA) is the most widely used dimensionality reduction technique in the biological sciences and is commonly employed to create 2D visualizations of geometric morphometric data. However, biologically interesting information may be lost or misrepresented in these plots because PCA cannot summarize nonlinear dependencies between variables. Nonlinear alternatives exist, but their effectiveness has never been tested on morphometric data. Here, the performance of PCA in visualizing morphometric variation is compared with that of four nonlinear techniques: Sammon Mapping, Isomap, Locally Linear Embedding, and Laplacian Eigenmaps. Performance is assessed on the basis of global and local preservation of pairwise distances across a variety of simulated and empirical datasets. The relative performance of PCA varies as a function of the distribution of variation, complexity, and size of the datasets. Overall, nonlinear methods preserve small differences between morphologies better than PCA does.
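The kind of assessment described above can be illustrated with a minimal sketch. It is not the author's implementation: the two scores below (a correlation of pairwise distances for global preservation, and k-nearest-neighbour overlap for local preservation) are simple stand-ins chosen for illustration, and the function names, the helix dataset, and the value of k are all assumptions. Only PCA is embedded here; a nonlinear embedding could be scored by passing its 2D coordinates to the same two functions.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def pca_embed(X, n_components=2):
    """Project centered data onto its leading principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def global_preservation(D_high, D_low):
    """Illustrative global score: Pearson correlation between the original
    and embedded pairwise distances (upper triangle only)."""
    iu = np.triu_indices_from(D_high, k=1)
    return float(np.corrcoef(D_high[iu], D_low[iu])[0, 1])


def local_preservation(D_high, D_low, k=5):
    """Illustrative local score: mean fraction of each point's k nearest
    neighbours that are still among its k nearest in the embedding."""
    # Column 0 of each argsort row is the point itself (distance 0),
    # assuming the data contain no duplicate points, so it is skipped.
    nn_high = np.argsort(D_high, axis=1)[:, 1:k + 1]
    nn_low = np.argsort(D_low, axis=1)[:, 1:k + 1]
    overlap = [len(set(a) & set(b)) / k for a, b in zip(nn_high, nn_low)]
    return float(np.mean(overlap))


def make_helix(n=200, seed=0):
    """Hypothetical simulated dataset: noisy points along a 3D helix."""
    rng = np.random.default_rng(seed)
    t = np.sort(rng.uniform(0.0, 4.0 * np.pi, n))
    X = np.column_stack([np.cos(t), np.sin(t), 0.3 * t])
    return X + 0.01 * rng.normal(size=X.shape)


if __name__ == "__main__":
    X = make_helix()
    Y = pca_embed(X, n_components=2)
    D_high, D_low = pairwise_distances(X), pairwise_distances(Y)
    print("global:", global_preservation(D_high, D_low))
    print("local: ", local_preservation(D_high, D_low, k=5))
```

Scoring global and local preservation separately matters because a projection can reproduce large distances well while still reordering close neighbours, which is the situation the abstract's final sentence refers to.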
Acknowledgements
Thanks to D. Fowler and H. Larsson for advice, as well as A. Beauvais-Lacasse and A. Huot for help with coding. I am grateful to the Natural Sciences and Engineering Research Council of Canada (CGS-D) and le Fonds de recherche du Québec - Nature et technologies (BX3) for funding.
Ethics declarations
Conflict of interest
The author has no conflicts of interest to declare.
Cite this article
Du, T.Y. Dimensionality Reduction Techniques for Visualizing Morphometric Data: Comparing Principal Component Analysis to Nonlinear Methods. Evol Biol 46, 106–121 (2019). https://doi.org/10.1007/s11692-018-9464-9
DOI: https://doi.org/10.1007/s11692-018-9464-9