Mass spectral reconstruction of LC/MS data with entropy minimization

https://doi.org/10.1016/j.ijms.2020.116359

Highlights

  • A novel deconvolution approach for LC-MS data.

  • Resolve asymmetric, co-eluted and trace peaks in LC-MS data.

  • A multivariate curve resolution approach that does not require any parameter settings.

Abstract

Liquid chromatography-mass spectrometry (LC/MS) is a major technique for the analysis of complex samples. Data processing for untargeted LC/MS analysis is very challenging due to the presence of asymmetric, co-eluted and trace-level peaks. We report a multivariate curve resolution approach based on entropy minimization to resolve LC/MS data. Our liquid chromatography band-target entropy minimization (lc-BTEM) approach enables automated resolution of complex LC/MS peaks without requiring any parameter inputs. We demonstrated the performance of our approach on co-eluted and trace-level peaks in an American ginseng extract. Consequently, we were able to putatively annotate more than 50 ginsenosides. The extracted pure mass spectra and in-source fragment information further enabled the annotation of [M+H]+, [M+H-Neu]+ and [M-H]- ions. This approach is expected to benefit untargeted LC/MS analysis and data-independent analysis of complex samples.

Introduction

Liquid chromatography-mass spectrometry (LC/MS) is a widely used technique in biochemical and biomedical research [1–5]. The technique combines sensitivity with selectivity and enables the analysis of a wide range of non-volatile compounds. However, LC/MS produces complex data and demands systematic, stringent and innovative data processing steps to extract meaningful information. The major challenges include resolving asymmetric, co-eluted and trace-level peaks. Current solutions to these challenges rely mainly on the optimization of experimental protocols. However, this can be very time-consuming and may not always resolve the problem, especially for complex biological samples.

Spectral deconvolution is a viable solution for resolving co-eluted and trace-level peaks in LC/MS data. Numerous empirical peak shape modeling techniques, such as the bi-Gaussian model and the exponentially modified Gaussian model, have been introduced over the years for this purpose [6–9]. Other approaches apply feature selection algorithms (e.g., CODA, CAMERA, centWave) and their variants [10–13]. However, a one-size-fits-all modeling solution may not be available, since the peak shape and width in LC/MS data can vary from one instrument to another or even across repeated runs on the same instrument. The analysis often requires fine adjustment of various parameter settings to ensure accurate deconvolution and is thus not easily accessible to non-experts.
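To make the two empirical peak shapes named above concrete, the following is a minimal sketch of a bi-Gaussian and an exponentially modified Gaussian (EMG) profile. The function names, argument names and numerical values are illustrative assumptions and are not taken from the cited works; the EMG is written in its standard closed form with unit-area normalization scaled by `area`.

```python
import math


def bi_gaussian(t, height, apex, sigma_left, sigma_right):
    """Gaussian with a different width on each side of the apex."""
    sigma = sigma_left if t < apex else sigma_right
    return height * math.exp(-0.5 * ((t - apex) / sigma) ** 2)


def emg(t, area, mu, sigma, tau):
    """Exponentially modified Gaussian: a Gaussian (mu, sigma) convolved
    with an exponential decay of time constant tau."""
    lam = 1.0 / tau
    exponent = 0.5 * lam * (2.0 * mu + lam * sigma ** 2 - 2.0 * t)
    erfc_arg = (mu + lam * sigma ** 2 - t) / (math.sqrt(2.0) * sigma)
    return area * 0.5 * lam * math.exp(exponent) * math.erfc(erfc_arg)


# Illustrative (hypothetical) parameters for a tailing peak near t = 5 min.
times = [3.0 + 0.001 * i for i in range(9001)]  # 3.0 to 12.0 min
tailing_peak = [emg(t, area=1.0, mu=5.0, sigma=0.1, tau=0.3) for t in times]
emg_area = sum(tailing_peak) * 0.001  # simple Riemann sum over the window
```

Each model needs its shape parameters (two widths, or a width and a decay constant) to be fitted per peak, which is the kind of parameter tuning the paragraph above describes.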

In recent years, an information entropy minimization approach has been implemented as a deconvolution technique [14,15]. As blind source separation algorithms, the family of band-target entropy minimization (BTEM) algorithms can extract pure signals from mixture signals by finding the linear combination of pure components, without the need for any a priori information [16–19]. In comparison to empirical peak shape modeling techniques, the BTEM approach does not require any parameter settings and is robust in handling different peak shapes and widths. Recently, rBTEM was introduced for the deconvolution of GC/MS data, and the results showed that it increased the sensitivity and accuracy of the analysis of co-eluting and trace-level components in comparison to empirical approaches [20].
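The idea of recovering a pure spectrum as a minimum-entropy linear combination of mixture signals can be illustrated with a toy example. The sketch below is not the authors' rBTEM or lc-BTEM implementation: it assumes only two synthetic scans and two hypothetical pure spectra, combines the scans directly, anchors the estimate at a chosen target channel, and uses a plain grid search with a simple non-negativity penalty. All names and values are assumptions made for illustration.

```python
import math


def shannon_entropy(spectrum):
    """Shannon entropy of a spectrum normalized to unit total intensity."""
    total = sum(abs(x) for x in spectrum)
    entropy = 0.0
    for x in spectrum:
        p = abs(x) / total
        if p > 0.0:
            entropy -= p * math.log(p)
    return entropy


def negativity_penalty(spectrum, weight=100.0):
    """Penalize negative intensities, which are physically meaningless."""
    return weight * sum(x * x for x in spectrum if x < 0.0)


def resolve_by_entropy(scan_1, scan_2, target_channel, steps=1000):
    """Grid-search k in [0, 1] so that scan_1 - k * scan_2, scaled to 1 at
    the target channel, has the lowest entropy plus penalty."""
    best_k, best_score, best_estimate = None, float("inf"), None
    for i in range(steps + 1):
        k = i / steps
        estimate = [a - k * b for a, b in zip(scan_1, scan_2)]
        anchor = estimate[target_channel]
        if anchor <= 0.0:
            continue  # the targeted band must stay positive
        estimate = [x / anchor for x in estimate]
        score = shannon_entropy(estimate) + negativity_penalty(estimate)
        if score < best_score:
            best_k, best_score, best_estimate = k, score, estimate
    return best_k, best_estimate


# Two hypothetical pure spectra over six m/z channels.
pure_a = [1.0, 0.0, 0.5, 0.0, 0.0, 0.0]
pure_b = [0.0, 1.0, 0.0, 0.0, 0.8, 0.2]
# Two synthetic scans containing both components in different proportions.
scan_1 = [0.7 * a + 0.3 * b for a, b in zip(pure_a, pure_b)]
scan_2 = [0.2 * a + 0.8 * b for a, b in zip(pure_a, pure_b)]

# Target the band at channel 0, which belongs to component A only.
best_k, recovered = resolve_by_entropy(scan_1, scan_2, target_channel=0)
```

In this toy case the lowest-entropy combination is the one in which the contribution of the second component cancels, so `recovered` matches `pure_a`. No peak shape model is involved, which is the property the paragraph above contrasts with empirical modeling.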

In this work, we extend the information entropy minimization approach to deconvolute co-eluted and trace-level peaks in LC/MS scan data. The rBTEM variant herein, known as lc-BTEM, considers the higher resolution of data generated by a triple quadrupole system to enable the extraction of pure mass spectra without the need for any parameter settings. We demonstrate its application on LC/MS scan data of American ginseng (Panax quinquefolius) extract. The scan data contain valuable information on molecular ions and on the fragment ions produced by in-source collisionally activated dissociation at the ESI interface. This makes the data suitable for the lc-BTEM algorithm, which is based on finding the linear combination of pure components in a mixture system. The availability of pure mass spectra and the inherent in-source fragment information enabled a seamless recognition of [M+H]+, [M+H-Neu]+ and [M-H]- ions and subsequently led to the putative annotation of ginsenoside subtypes for more than 50 ginsenosides.
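The ion types named above are related by simple mass arithmetic, sketched here under stated assumptions. "Neu" is read as a neutral loss (for example a hexose residue or water), which is an interpretation rather than something this excerpt defines. The helper names, the 0.5 Da tolerance (chosen with unit-resolution quadrupole data in mind) and the example peak values are hypothetical; the proton, hexose residue (C6H10O5) and water masses are standard monoisotopic values.

```python
PROTON_MASS = 1.007276    # Da
HEXOSE_LOSS = 162.052825  # Da, C6H10O5 (loss of a hexose residue)
WATER_LOSS = 18.010565    # Da, H2O


def mz_protonated(neutral_mass):
    """m/z of the singly charged [M+H]+ ion."""
    return neutral_mass + PROTON_MASS


def mz_deprotonated(neutral_mass):
    """m/z of the singly charged [M-H]- ion."""
    return neutral_mass - PROTON_MASS


def mz_neutral_loss(neutral_mass, loss):
    """m/z of [M+H-Neu]+ for a given neutral loss mass."""
    return neutral_mass + PROTON_MASS - loss


def pair_polarities(pos_peaks, neg_peaks, tol=0.5):
    """Pair positive- and negative-mode peaks consistent with one neutral
    mass M, i.e. peaks that differ by two proton masses within tol."""
    pairs = []
    for p in pos_peaks:
        for n in neg_peaks:
            if abs((p - n) - 2.0 * PROTON_MASS) <= tol:
                pairs.append((p, n, (p + n) / 2.0))  # last item estimates M
    return pairs


# Hypothetical nominal peaks from resolved positive- and negative-mode spectra.
pairs = pair_polarities(pos_peaks=[801.0, 639.0], neg_peaks=[799.0])
```

Here the 801.0 and 799.0 peaks pair up and suggest a neutral mass near 800, while 639.0 is consistent with a hexose loss from the protonated ion. Such consistency checks only support putative annotation.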

Materials and instrumentation

Methanol (LCMS grade), acetonitrile (LCMS grade) and formic acid (98–100%, analytical grade) were purchased from Sigma-Aldrich, Merck, Singapore. Three-year-old American ginseng (Panax quinquefolius) harvested in Canada and Cordyceps militaris were purchased from Hockhua Tonic, Singapore.

LC/MS analyses were performed on a Shimadzu LCMS-8040 and an Agilent G6550 QTOF LC/MS. A sonic dismembrator (Fisherbrand Model 505) was used to ultrasonicate the samples.

Sample preparation

Panax

Results and discussion

Full-scan positive- and negative-ion mode LC/MS analyses were performed on the ginseng extract. The positive-ion mode TIC is shown in Fig. 1, with the inset indicating two regions of interest for analysis, namely regions A and B. Region A consists of a trace-level peak, while region B contains a subtle overlapping peak that can easily be mistaken for a pure component peak. These regions were systematically analyzed with lc-BTEM to extract the pure component spectra for putative annotation.

Conclusion

In this work, a spectral deconvolution approach for LC/MS data is introduced for the analysis of co-eluting and trace-level components in full-scan mode. The extracted pure component spectra provide clean in-source fragmentation patterns and characteristic fragment ions, which are especially useful for a quick survey of the total amount and subtypes of ginsenosides present in the ginseng extract studied in this work.

More importantly, this study demonstrated the capability of

CRediT authorship contribution statement

Hua Jun Zhang: Supervision, Conceptualization, Software. Yunbo Lv: Conceptualization, Data curation, Project administration, Writing - original draft. Chun Kiang Chua: Conceptualization, Visualization, Writing - original draft. Tai Guo: Software. Zhe Sun: Writing - review & editing. Zhaoqi Zhan: Supervision, Writing - review & editing.

Declaration of competing interest

The SmartDalton software is from ChemoPower Technology Pte Ltd in Singapore. Hua Jun Zhang, Yunbo Lv, Chun Kiang Chua and Tai Guo worked for ChemoPower Technology Pte Ltd at the time of submission.

Acknowledgment

We would like to thank Ms. Li Limin for preparing the ginseng extract.

References (37)

  • Y. Chen et al., Determination of ginsenosides in Asian and American ginsengs by liquid chromatography–quadrupole/time-of-flight MS: assessing variations based on morphological characteristics, J. Ginseng Res. (2017)

  • J. Liu et al., The integration of GC–MS and LC–MS to assay the metabolomics profiling in Panax ginseng and Panax quinquefolius reveals a tissue-and species-specific connectivity of primary metabolites and ginsenosides accumulation, J. Pharmaceut. Biomed. Anal. (2017)

  • J. Zhang et al., A metabolomics approach for authentication of Ophiocordyceps sinensis by liquid chromatography coupled with quadrupole time-of-flight mass spectrometry, Food Res. Int. (2015)

  • J. Fenn et al., Electrospray ionization for mass spectrometry of large biomolecules, Science (1989)

  • S. Banerjee et al., Electrospray ionization mass spectrometry: a technique to access the information beyond the molecular weight of the analyte, Int. J. Anal. Chem. (2012)

  • L.W. Sumner et al., Proposed quantitative and alphanumeric metabolite identification metrics, Metabolomics (2014)

  • T. Yu et al., apLCMS—adaptive processing of high-resolution LC/MS data, Bioinformatics (2009)

  • T. Pluskal et al., MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data, BMC Bioinf. (2010)