The Age of Data-Driven Proteomics: How Machine Learning Enables Novel Workflows

Proteomics (IF 3.4), Pub Date: 2020-04-08, DOI: 10.1002/pmic.201900351
Robbin Bouwmeester 1,2; Ralf Gabriels 1,2; Tim Van Den Bossche 1,2; Lennart Martens 1,2; Sven Degroeve 1,2

A lot of energy in the field of proteomics is dedicated to the application of challenging experimental workflows, which include metaproteomics, proteogenomics, data independent acquisition (DIA), non‐specific proteolysis, immunopeptidomics, and open modification searches. These workflows are all challenging because of ambiguity in the identification stage; they either expand the search space and thus increase the ambiguity of identifications, or, in the case of DIA, they generate data that is inherently more ambiguous. In this context, machine learning‐based predictive models are now generating considerable excitement in the field of proteomics because these predictive models hold great potential to drastically reduce the ambiguity in the identification process of the above‐mentioned workflows. Indeed, the field has already produced classical machine learning and deep learning models to predict almost every aspect of a liquid chromatography‐mass spectrometry (LC‐MS) experiment. Yet despite all the excitement, thorough integration of predictive models in these challenging LC‐MS workflows is still limited, and further improvements to the modeling and validation procedures can still be made. Therefore, highly promising recent machine learning developments in proteomics are pointed out in this viewpoint, alongside some of the remaining challenges.

Updated: 2020-04-08
