Multi-block PLS discriminant analysis for the joint analysis of metabolomic and epidemiological data.
Metabolomics (IF 3.6), Pub Date: 2019-10-03, DOI: 10.1007/s11306-019-1598-y
Marion Brandolini-Bunlon 1, Mélanie Pétéra 1, Pierrette Gaudreau 2, 3, Blandine Comte 4, Stéphanie Bougeard 5, Estelle Pujos-Guillot 1, 4

INTRODUCTION Metabolomics is a powerful phenotyping tool in nutrition and health research. It generates complex data that need dedicated treatment to enrich knowledge of biological systems. In particular, to investigate relations between environmental factors, phenotypes and metabolism, discriminant statistical analyses are generally performed separately on metabolomic datasets and then complemented by associations with metadata. Another relevant strategy is to analyse thematic data blocks simultaneously by multi-block partial least squares discriminant analysis (MBPLSDA), which makes it possible to determine the importance of variables and blocks in discriminating groups of subjects while taking the data structure into account.

OBJECTIVE The objective was to develop a complete, open-source, standalone tool covering all steps of MBPLSDA for the joint analysis of metabolomic and epidemiological data.

METHODS The tool is based on the mbpls function of the ade4 R package, enriched with additional functionalities, including some dedicated to discriminant analysis. The indicators it provides help to determine the optimal number of components, to check the validity of the MBPLSDA model, and to evaluate the variability of its parameters and predictions.

RESULTS To illustrate the potential of this tool, MBPLSDA was applied to a real case study involving metabolomic, nutritional and clinical data from a human cohort. Having the different functionalities available in a single R package allowed parameters to be optimized for an efficient joint analysis of metabolomic and epidemiological data, yielding new insights into multidimensional phenotypes.

CONCLUSION In particular, we highlighted the impact of filtering the metabolomic variables beforehand, and the relevance of an MBPLSDA approach in comparison to a standard PLS discriminant analysis method.
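
As a rough illustration of the multi-block idea described in the abstract, the following Python sketch computes only the first component of a two-class PLS discriminant analysis on block-scaled, concatenated data, and reports each block's share of the squared weights as a simple importance measure. It is not the authors' tool and does not reproduce the ade4 mbpls function. The function names, the block names, the scaling choice (autoscaling followed by division by the square root of the number of variables in the block) and the toy values are all assumptions made for illustration.

```python
import math


def autoscale(block):
    """Center each column and scale it to unit variance (population SD)."""
    n = len(block)
    scaled_cols = []
    for col in zip(*block):
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        scaled_cols.append([(v - mean) / sd if sd > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_cols)]


def mbplsda_first_component(blocks, labels):
    """First component of a two-class multi-block PLS-DA (illustrative only).

    blocks: dict mapping a block name to a list of rows; every block must
            describe the same subjects in the same order.
    labels: list of 0/1 class labels, one per subject.
    """
    n = len(labels)
    y_mean = sum(labels) / n
    y = [v - y_mean for v in labels]  # centered dummy-coded response

    # Block scaling (assumed scheme): autoscale, then divide by sqrt(p)
    # so that every block carries the same total variance.
    merged = [[] for _ in range(n)]
    spans = {}
    for name, block in blocks.items():
        scaled = autoscale(block)
        p = len(scaled[0])
        start = len(merged[0])
        for i in range(n):
            merged[i].extend(v / math.sqrt(p) for v in scaled[i])
        spans[name] = (start, start + p)

    # PLS weight vector: proportional to X'y, normalized to unit length.
    p_total = len(merged[0])
    w = [sum(merged[i][j] * y[i] for i in range(n)) for j in range(p_total)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]

    # Global scores, then regression of the centered response on them.
    t = [sum(merged[i][j] * w[j] for j in range(p_total)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, y)) / sum(ti * ti for ti in t)
    predicted = [1 if ti * q + y_mean > 0.5 else 0 for ti in t]

    # Block importance: share of the squared weights that falls in each block.
    importance = {
        name: sum(v * v for v in w[a:b]) for name, (a, b) in spans.items()
    }
    return predicted, importance


# Made-up toy data: 6 subjects, 2 "metabolomic" variables, 1 "clinical" variable.
blocks = {
    "metabolomic": [[1.0, 0.5], [1.2, 0.7], [0.9, 0.6],
                    [3.0, 0.6], [3.1, 0.5], [2.8, 0.7]],
    "clinical": [[5.0], [5.4], [5.1], [5.3], [5.6], [5.2]],
}
labels = [0, 0, 0, 1, 1, 1]

predicted, importance = mbplsda_first_component(blocks, labels)
print("predicted classes:", predicted)
print("block importance:", importance)
```

A complete analysis would extract further components with deflation and assess the model by cross-validation and permutation testing, in line with the validity and variability indicators mentioned in the METHODS paragraph. None of that is attempted in this sketch.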

Updated: 2019-10-03