Dimensionality Reduction Techniques for Visualizing Morphometric Data: Comparing Principal Component Analysis to Nonlinear Methods

  • Tools and Techniques
  • Evolutionary Biology

Abstract

Principal component analysis (PCA) is the most widely used dimensionality reduction technique in the biological sciences and is commonly employed to create 2D visualizations of geometric morphometric data. However, interesting biological information may be lost or misrepresented in these plots because PCA cannot summarize nonlinear dependencies between variables. Nonlinear alternatives exist, but their effectiveness has never been tested on morphometric data. Here, the performance of PCA in visualizing morphometric variation is compared with that of four nonlinear techniques: Sammon Mapping, Isomap, Locally Linear Embedding, and Laplacian Eigenmaps. Performance is assessed on the basis of global and local preservation of pairwise distances across a variety of simulated and empirical datasets. The relative performance of PCA varies as a function of the distribution of variation, complexity, and size of the datasets. Overall, nonlinear methods preserve small differences between morphologies better than PCA does.
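The kind of evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the author's code, and the two scores are simple stand-ins rather than the measures used in the article: global preservation is taken here as the correlation between original and embedded pairwise distances, and local preservation as the share of each point's nearest neighbours that survive the projection. The function names, the helix-shaped toy dataset, and the neighbourhood size `k` are all assumptions made for illustration. Only PCA is implemented; the nonlinear methods are omitted.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def pca_project(X, n_components=2):
    """Project the rows of X onto their first principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def global_preservation(X, Y):
    """Pearson correlation between original and embedded pairwise distances."""
    iu = np.triu_indices(X.shape[0], k=1)  # each pair counted once
    dx = pairwise_distances(X)[iu]
    dy = pairwise_distances(Y)[iu]
    return float(np.corrcoef(dx, dy)[0, 1])

def local_preservation(X, Y, k=5):
    """Mean fraction of each point's k nearest neighbours kept in the embedding."""
    def knn(D):
        # Column 0 of the argsort is the point itself (distance 0), so skip it.
        # This assumes the data contain no duplicate points.
        return np.argsort(D, axis=1)[:, 1:k + 1]
    nn_x = knn(pairwise_distances(X))
    nn_y = knn(pairwise_distances(Y))
    overlap = [len(set(a) & set(b)) / k for a, b in zip(nn_x, nn_y)]
    return float(np.mean(overlap))

# Hypothetical toy data: noisy points along a helix, a curved (nonlinear)
# one-parameter family embedded in three dimensions.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 1.5 * np.pi, 200))
X = np.column_stack([np.cos(t), np.sin(t), 0.3 * t])
X = X + rng.normal(scale=0.01, size=X.shape)

Y = pca_project(X, n_components=2)
print("global preservation:", global_preservation(X, Y))
print("local preservation: ", local_preservation(X, Y, k=5))
```

Both scores equal 1 when the embedding reproduces the original distances exactly, so lower values indicate more distortion. Scoring a nonlinear embedding of the same data with the same two functions would give a like-for-like comparison under these assumed measures.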



Acknowledgements

Thanks to D. Fowler and H. Larsson for advice, as well as A. Beauvais-Lacasse and A. Huot for help with coding. I am grateful to the Natural Sciences and Engineering Research Council of Canada (CGS-D) and le Fonds de recherche du Québec - Nature et technologies (BX3) for funding.

Author information

Corresponding author

Correspondence to Trina Y. Du.

Ethics declarations

Conflict of interest

The author has no conflicts of interest to declare.


Cite this article

Du, T.Y. Dimensionality Reduction Techniques for Visualizing Morphometric Data: Comparing Principal Component Analysis to Nonlinear Methods. Evol Biol 46, 106–121 (2019). https://doi.org/10.1007/s11692-018-9464-9


  • DOI: https://doi.org/10.1007/s11692-018-9464-9
