Machine Learning Applications for Mass Spectrometry-Based Metabolomics.
Metabolites (IF 4.1), Pub Date: 2020-06-13, DOI: 10.3390/metabo10060243
Ulf W Liebal 1 , An N T Phan 1 , Malvika Sudhakar 2, 3, 4 , Karthik Raman 2, 3, 4 , Lars M Blank 1

The metabolome of an organism depends on environmental factors and intracellular regulation and provides information about the organism's physiological condition. Metabolomics helps to understand disease progression in clinical settings or to estimate metabolite overproduction for metabolic engineering. The most popular analytical platform for metabolomics is mass spectrometry (MS). However, the analysis of MS metabolome data is complicated, since metabolites interact nonlinearly and the data structures themselves are complex. Machine learning methods have become immensely popular for statistical analysis because they inherently represent nonlinear data and can process large, heterogeneous data sets rapidly. In this review, we address recent developments in using machine learning for processing MS spectra and show how machine learning generates new biological insights. In particular, supervised machine learning has great potential in metabolomics research because it can supply quantitative predictions. We review commonly used tools, such as random forests, support vector machines, artificial neural networks, and genetic algorithms. During processing, supervised machine learning methods assist with peak picking, normalization, and missing data imputation. For knowledge-driven analysis, machine learning contributes to biomarker detection, classification and regression, biochemical pathway identification, and carbon flux determination. Of particular relevance is the combination of different omics data to identify the contributions of the various regulatory levels. Our overview of recent publications also highlights that data quality determines analysis quality but also adds to the challenge of choosing the right model for the data. Machine learning methods applied to MS-based metabolomics ease data analysis and can support clinical decisions, guide metabolic engineering, and stimulate fundamental biological discoveries.

Updated: 2020-06-13