IDSL_MINT: a deep learning framework to predict molecular fingerprints from mass spectra
Journal of Cheminformatics (IF 8.6), Pub Date: 2024-01-18, DOI: 10.1186/s13321-024-00804-5
Sadjad Fakouri Baygi, Dinesh Kumar Barupal

The majority of tandem mass spectrometry (MS/MS) spectra in untargeted metabolomics and exposomics studies lack any annotation. Our deep learning framework, Integrated Data Science Laboratory for Metabolomics and Exposomics - Mass INTerpreter (IDSL_MINT), translates MS/MS spectra into molecular fingerprint descriptors. IDSL_MINT lets users apply the transformer model, the architecture behind large language models, to mass spectrometry data. Models are trained on user-provided reference MS/MS libraries using any customizable molecular fingerprint descriptor. IDSL_MINT was benchmarked using the LipidMaps database, and in a test study it improved the annotation rate for MS/MS spectra that had not been annotated using existing mass spectral libraries. IDSL_MINT may improve overall annotation rates in untargeted metabolomics and exposomics studies. The IDSL_MINT framework and tutorials are available in the GitHub repository at https://github.com/idslme/IDSL_MINT.

Scientific contribution statement: Structural annotation of MS/MS spectra from untargeted metabolomics and exposomics datasets is a major bottleneck in gaining new biological insights. Machine learning models that convert spectra into molecular fingerprints can help in the annotation process. Here, we present IDSL_MINT, a new, easy-to-use, and customizable deep learning framework for training and applying models that predict molecular fingerprints from spectra in compound annotation workflows.
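To illustrate how a predicted molecular fingerprint can feed a compound annotation workflow, the following minimal sketch ranks candidate structures by Tanimoto similarity to a predicted fingerprint. It is not taken from IDSL_MINT and does not use its API: the function names, the representation of a fingerprint as a set of "on" bit indices, and the example data are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_candidates(
    predicted: Set[int], library: Dict[str, Set[int]]
) -> List[Tuple[str, float]]:
    """Score every library entry against the predicted bits, best match first."""
    scored = [(name, tanimoto(predicted, bits)) for name, bits in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical data: bit indices a model might predict for one MS/MS spectrum,
# and fingerprints of three made-up candidate structures.
predicted_bits = {1, 4, 7, 9}
candidates = {
    "candidate_A": {1, 4, 7, 9, 12},
    "candidate_B": {2, 3, 9},
    "candidate_C": {1, 4},
}

ranking = rank_candidates(predicted_bits, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In practice the candidate fingerprints would be computed from known structures with the same descriptor used to train the model, so that predicted and reference bits are directly comparable.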

Updated: 2024-01-19