Image classification combined with faster R–CNN for the peak detection of complex components and their metabolites in untargeted LC-HRMS data
Analytica Chimica Acta ( IF 5.7 ) Pub Date : 2022-07-21 , DOI: 10.1016/j.aca.2022.340189
Jun Zeng 1 , Hai Wu 2 , Min He 1

Peak detection in untargeted liquid chromatography-high resolution mass spectrometry (LC-HRMS) data is a key step in identifying the metabolic status of druggable chemicals and of extracts from functional foods or herbs. Existing approaches, however, struggle to deliver results with both few false positives and few false negatives. In this paper, we propose an automatic method, named Peak_CF, that combines a convolutional neural network (CNN) for image classification with Faster R-CNN for peak localization and classification in untargeted LC-HRMS data. On an evaluation dataset, it detects target peaks with high accuracy and high recall (both >90%). For detecting the m/z peaks of known compounds, Peak_CF outperforms Peakonly, and it can effectively judge the overall peak shape of split peaks. On the same evaluation data, the recall of MZmine2 (ADAP) is slightly higher than that of Peak_CF, but the F1 score of Peak_CF is higher, indicating that it is more accurate overall. In addition, a Peak_CF model with strong generalization ability was trained and verified. Finally, Peak_CF was applied to real metabolic fingerprints of total flavonoids from Glycyrrhiza uralensis Fisch, and the methods were compared on 40 m/z peaks of 40 prototype compounds in a serum dataset. The recall of both Peak_CF and Peakonly reached 95%, higher than the 70% of MZmine2 (ADAP), and Peak_CF was more accurate when detecting extracted ion chromatograms (EICs) with serious drift. In conclusion, Peak_CF provides a new route for mining LC-HRMS datasets of metabolites of drugs, herbs, or functional foods.
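The abstract compares methods by recall and F1 score. As a minimal sketch of how these metrics relate, the Python snippet below computes precision, recall, and F1 from counts of true positives, false positives, and false negatives. The function name and all counts are hypothetical illustrations and are not taken from the paper; they only show how a detector with slightly higher recall can still have a lower F1 score when it reports many false positives.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from detection counts.

    tp: true peaks that were detected
    fp: detections that are not true peaks
    fn: true peaks that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for two detectors on the same 100 true peaks.
# These numbers are assumptions for illustration, NOT results from the paper.
detector_a = precision_recall_f1(tp=92, fp=8, fn=8)    # balanced errors
detector_b = precision_recall_f1(tp=95, fp=40, fn=5)   # more hits, many false positives

for name, (p, r, f1) in (("A", detector_a), ("B", detector_b)):
    print(f"detector {name}: precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, a gain in recall that comes with a large drop in precision lowers F1, which is the kind of trade-off the MZmine2 (ADAP) comparison in the abstract describes.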




Updated: 2022-07-21