Fast metabolite identification in nuclear magnetic resonance metabolomic studies: statistical peak sorting and peak overlap detection for more reliable database queries
Journal of Proteome Research (IF 4.4), Pub Date: 2017-11-14, DOI: 10.1021/acs.jproteome.7b00617
Pablo A. Hoijemberg 1, 2 , István Pelczer 1

Researchers spend a great deal of time identifying metabolites in NMR-based metabolomic studies. Identification usually starts by querying public or commercial databases to match chemical shifts thought to belong to a given compound. Statistical total correlation spectroscopy (STOCSY), in use for more than a decade, speeds the process by finding statistical correlations among peaks, which yields a better peak list as input for the database query. However, the analysis (normally not automated) becomes challenging because of the intrinsic issue of peak overlap, where correlations from more than one compound appear in the same STOCSY trace. Here we present a fully automated methodology that analyzes all STOCSY traces at once (every peak is chosen as driver peak) and overcomes the peak overlap obstacle. Peak overlap detection by clustering analysis and sorting of traces (POD-CAST) first creates an overlap matrix from the STOCSY traces, then clusters the overlap traces based on their similarity, and finally calculates a cumulative overlap index (COI) to account for both strong and intermediate correlations. This information is gathered in one plot to help the user identify the groups of peaks that would belong to a single molecule and perform a more reliable database query. Examining all traces simultaneously reduces analysis time compared with viewing STOCSY traces in pairs or small groups, and condenses the redundant information in the 2D STOCSY matrix into bands containing similar traces. The COI helps detect overlapping peaks, which can be added to the peak list from another cross-correlated band.
POD-CAST addresses the generally overlooked and underestimated presence of overlapping peaks: it detects them and includes them in the search for every compound contributing to the overlap, so the user can accelerate metabolite identification with more successful database queries and search for all tentative compounds in the sample set.
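
To make the idea of "every peak as driver peak" concrete, the following is a minimal sketch in Python, not the authors' POD-CAST implementation. It assumes a table of peak intensities (samples by peaks), computes all STOCSY traces as a Pearson correlation matrix, and then groups peaks with a simple threshold-based connected-components pass. That grouping is a stand-in for the clustering step; the overlap matrix and the COI are not defined in this abstract and are not reproduced here. The function names, the threshold of 0.95, and the synthetic data are all assumptions for illustration.

```python
import numpy as np


def stocsy_matrix(intensities):
    """Pearson correlation between every pair of peaks across samples.

    intensities has shape (n_samples, n_peaks). Row i of the result is the
    STOCSY trace obtained with peak i as the driver peak.
    """
    return np.corrcoef(intensities, rowvar=False)


def group_peaks(corr, threshold=0.95):
    """Group peaks linked by correlations at or above the threshold.

    Plain connected-components grouping (an assumption, standing in for the
    clustering of traces); the threshold value is arbitrary.
    """
    n = corr.shape[0]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and corr[i, j] >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Synthetic data: two hypothetical compounds with independent concentrations.
rng = np.random.default_rng(0)
conc_a = rng.uniform(1.0, 10.0, size=40)
conc_b = rng.uniform(1.0, 10.0, size=40)
noise = rng.normal(0.0, 0.05, size=(40, 5))

# Peaks 0-1 come from compound A, peaks 2-3 from compound B,
# and peak 4 is an overlap containing signal from both.
peaks = np.column_stack(
    [conc_a, 2.0 * conc_a, conc_b, 0.5 * conc_b, conc_a + conc_b]
) + noise

corr = stocsy_matrix(peaks)
labels = group_peaks(corr, threshold=0.95)
print(labels)
print(np.round(corr[4], 2))  # trace driven by the overlapping peak
```

In this toy setup the overlapping peak correlates only moderately with the peaks of either compound, so a strict threshold leaves it in a group of its own. That is the kind of intermediate correlation the abstract says the COI is meant to account for.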

Updated: 2017-11-14