Abstract
Multivariate methods have the potential to better capture complex relationships that may exist between different biological levels. Multiple Factor Analysis (MFA) is one of the most popular methods to obtain factor scores and measures of discrepancy between data sets. However, the singular value decomposition in MFA is based on PCA, which is adequate only if the data are normally distributed, linear, or stationary. In addition, including strongly correlated variables can overemphasize the contribution of the estimated components. In this work, we introduced a novel method, referred to as Independent Multifactorial Analysis (ICA-MFA), to derive relevant features from multiscale data. This method is an extended implementation of MFA in which the component value decomposition is based on Independent Component Analysis. In addition, ICA-MFA incorporates a predictive step based on Independent Component Regression. In a simulation study, we evaluated and compared the performance of ICA-MFA with both the MFA method and traditional univariate analyses. We showed that ICA-MFA explained up to 10-fold more variance than MFA and univariate methods. We applied the proposed algorithm in a study of 4057 individuals belonging to the population-based Rotterdam Study with available genetic and neuroimaging data, as well as information about executive cognitive functioning. Specifically, we used ICA-MFA to detect relevant genetic features related to structural brain regions, which in turn were involved in the mechanisms of executive cognitive function. The proposed strategy makes it possible to determine the degree to which the whole set of genetic and/or neuroimaging markers contributes to the variability of the symptomatology jointly, rather than individually.
While univariate results and MFA combinations explained only a limited proportion of variance (less than 2%), our method increased the explained variance (10%) and allowed the identification of significant components that maximize the variance explained in the model. The potential application of the ICA-MFA algorithm constitutes an important aspect of integrating multivariate multiscale data, specifically in the field of Neurogenetics.
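The general idea described above (MFA-style block weighting, an ICA decomposition in place of PCA, and a regression on the resulting components) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mfa_weight` and `ica_mfa_sketch`, the use of scikit-learn's `FastICA` and `LinearRegression`, the number of components, and the simulated "genetic" and "imaging" blocks are all assumptions made for illustration only.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.linear_model import LinearRegression


def mfa_weight(block):
    """Standardize a block, then divide by its first singular value.

    This is the usual MFA normalization, so that no single block
    dominates the merged table. Assumes no constant columns.
    """
    z = (block - block.mean(axis=0)) / block.std(axis=0)
    first_sv = np.linalg.svd(z, compute_uv=False)[0]
    return z / first_sv


def ica_mfa_sketch(blocks, y, n_components=3, seed=0):
    """Hypothetical helper: merge weighted blocks, run ICA, regress y on the scores."""
    merged = np.hstack([mfa_weight(b) for b in blocks])
    ica = FastICA(n_components=n_components, random_state=seed, max_iter=1000)
    scores = ica.fit_transform(merged)         # shape (n_samples, n_components)
    model = LinearRegression().fit(scores, y)  # regression on independent components
    return scores, model.score(scores, y)      # in-sample R^2


# Simulated example: two data blocks driven by shared non-Gaussian sources.
rng = np.random.default_rng(0)
n = 200
sources = rng.laplace(size=(n, 3))
genetic = sources @ rng.normal(size=(3, 10)) + 0.1 * rng.normal(size=(n, 10))
imaging = sources @ rng.normal(size=(3, 6)) + 0.1 * rng.normal(size=(n, 6))
y = sources[:, 0] + 0.1 * rng.normal(size=n)

scores, r2 = ica_mfa_sketch([genetic, imaging], y)
print(scores.shape, round(r2, 3))
```

The in-sample R² reported here is only a rough analogue of the "explained variance" discussed in the abstract; the published method may select and test components differently.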
Acknowledgements
Natalia Vilor-Tejedor is funded by a pre-doctoral grant from the Agència de Gestió d’Ajuts Universitaris i de Recerca (2017 FI_B 00636), Generalitat de Catalunya – Fons Social Europeu. This work has been partially supported by a STSM Grant from EU COST Action 15120 Open Multiscale Systems Medicine (OpenMultiMed) and Centro de Investigación Biomédica en Red de Epidemiología y Salud Pública (CIBERESP). Further support was obtained through the Ministerio de Economía e Innovación (Spain), grant MTM2015-68140-R. ISGlobal is a member of the CERCA Programme, Generalitat de Catalunya.
Silvia Alemany thanks the Institute of Health Carlos III for her Sara Borrell postdoctoral grant (CD14/00214).
The generation and management of GWAS genotype data for the Rotterdam Study are supported by the Netherlands Organization of Scientific Research NWO Investments (no. 175.010.2005.011, 911-03-012). This study is funded by the Research Institute for Diseases in the Elderly (014-93-015; RIDE2), the Netherlands Genomics Initiative (NGI)/Netherlands Organization for Scientific Research (NWO) project no. 050-060-810. The Rotterdam Study is funded by Erasmus Medical Center and Erasmus University, Rotterdam, Netherlands Organization for the Health Research and Development (ZonMw), the Research Institute for Diseases in the Elderly (RIDE), the Ministry of Education, Culture and Science, the Ministry for Health, Welfare and Sports, the European Commission (DG XII), and the Municipality of Rotterdam. This research is supported by the Dutch Technology Foundation STW (12723), which is part of the NWO, and which is partly funded by the Ministry of Economic Affairs. This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (project: ORACLE, grant agreement No: 678543).
Ethics declarations
Conflict of Interest
None.
Additional information
H. H. Adams and J. R. González are co-last authors.
Cite this article
Vilor-Tejedor, N., Ikram, M.A., Roshchupkin, G.V. et al. Independent Multiple Factor Association Analysis for Multiblock Data in Imaging Genetics. Neuroinform 17, 583–592 (2019). https://doi.org/10.1007/s12021-019-09416-z
DOI: https://doi.org/10.1007/s12021-019-09416-z