Chemometrics, Comprehensive Two-Dimensional gas chromatography and “omics” sciences: Basic tools and recent applications

https://doi.org/10.1016/j.trac.2020.116111

Highlights

  • Chemometrics allows trend recognition that would not be discernible using manual assessment.

  • GC × GC data structure, pre-processing steps and algorithm restrictions are discussed.

  • Chemometric strategies applied in omics fields, and their misuse, are evaluated.

Abstract

The advent of Comprehensive Two-dimensional Gas Chromatography (GC × GC) as a practical and accessible analytical tool has had a considerable impact on the analytical procedures associated with the so-called “omics” sciences. Especially when GC × GC is hyphenated to mass spectrometers or other multichannel detectors, up to thousands of metabolites can be separated, detected and identified in a single run. However, the resulting data sets are exceedingly complex, and retrieving meaningful biochemical information from them demands powerful statistical tools that can deal effectively with the massive amount of information generated. Moreover, obtaining results that are valid from a chemical and biological standpoint depends on the analyst having a deep understanding of the fundamentals of both GC × GC and chemometrics. This review focuses on the basics of contemporary, fundamental chemometric tools applied to the processing of GC × GC data obtained from metabolomic, petroleomic and foodomic analyses. We describe the fundamentals of pattern recognition methods applied to GC × GC and explore how different detectors affect data structure and the approaches for better data handling. Limitations regarding data structure and deviations from linearity are stressed for each algorithm, as well as their typical applications and expected output.

Introduction

Regardless of the line of research followed, the analyst should aim for an analysis that is quick, efficient and as clean as possible. The proper selection of techniques both for sample preparation and for the analysis of the extracts is a significant step in reducing overall time and cost. A thorough understanding of the intermolecular interactions among all chemical species involved, and of the nature of the sample, is needed to optimize these parameters [1]. Therefore, the selection of sensitive and compatible techniques is paramount in ensuring both adequate detectability and the use of reduced amounts of sample. When the analysis is focused on the so-called “omics” sciences, the use of robust, high-sensitivity and high-throughput techniques is not only desirable but mandatory. One of the omics, metabolomics, is based on analytical strategies that aim to identify, and in some cases quantify, changes in the endogenous metabolites of biological systems in response to intrinsic or extrinsic factors [2]. Complete metabolomic strategies deal with both primary and secondary metabolites, which span a broad range of physical and chemical properties (lipids, sugars and amino acids, among others). Thus, given the great complexity and variation in the classes of compounds involved, metabolomics poses an even greater analytical challenge than other omics approaches [3].

Established techniques such as conventional liquid or gas chromatography, despite their high separation power and sensitivity, may be insufficient to resolve and identify the compounds present in metabolomic samples, or to provide data sets from which the information relevant to characterizing these samples can be retrieved. In this context, multidimensional chromatographic techniques, and especially Comprehensive Two-dimensional Gas Chromatography (GC × GC), can be considered the most significant advance in the area of separations since the invention of capillary chromatography in 1958 [4,5]. GC × GC is based on systems fitted with two chromatographic columns coated with stationary phases that have distinct separation mechanisms, the first- and second-dimension (1D and 2D) columns, connected in series by an interface called the modulator [6]. The addition of a second separation dimension increases the overall system peak capacity n to n1 × n2 (where n1 and n2 are the peak capacities of the 1D and 2D columns, respectively). This multiplicative gain is much larger than what is possible in conventional GC using typical strategies to improve selectivity and efficiency; for example, a twofold increase in peak capacity in GC demands an impracticable increase in analysis time [4]. Therefore, extremely complex samples that were previously evaluated generically by observing bulk properties can now be studied in detail at the molecular level with GC × GC: the composition of these samples can be understood in a simple and structured way.

Another aspect of applying GC × GC to “omics” sample sets is the data processing required. In conventional chromatography, the usual input for information retrieval is a table listing the retention time, peak area and MS identification of each detected analyte. However, in GC × GC, the slicing of chromatographic bands during the modulation process and the increase in the number of detected peaks make building such a table non-trivial. Extra care is required when reconstructing the two-dimensional chromatographic space, so that only modulated peaks originating from the same compound are associated with one another. In addition, considering the amount of information extracted from a single sample in a single GC × GC run, it is not feasible to manually interpret hundreds of distinct chromatographic peaks within an acceptable time window. Therefore, the verification of classical chromatographic parameters such as peak area and resolution, and sample fingerprinting, are no longer trivial steps: the high density of data generated by GC × GC requires data processing techniques in which the search for information demands reduced intervention from the analyst, typically multivariate chemometric tools.
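
The reconstruction of the two-dimensional chromatographic space mentioned above can be illustrated with a minimal Python sketch. It assumes a detector trace sampled at a constant rate and a fixed modulation period, and simply folds the one-dimensional signal so that each row holds one modulation (one slice of the 1D eluate). The function names, the 100 Hz acquisition rate, the 5 s modulation period and the peak capacities of 400 and 20 are hypothetical values chosen for the example, not figures taken from the review.

```python
import numpy as np


def fold_modulated_signal(signal, sampling_rate_hz, modulation_period_s):
    """Fold a 1D detector trace into a (modulations x points-per-modulation) matrix.

    Rows follow the first-dimension retention axis (one row per modulation);
    columns follow the second-dimension retention axis within a modulation.
    Trailing points that do not fill a complete modulation are discarded.
    """
    points_per_modulation = int(round(sampling_rate_hz * modulation_period_s))
    trace = np.asarray(signal, dtype=float)
    n_modulations = trace.size // points_per_modulation
    trimmed = trace[: n_modulations * points_per_modulation]
    return trimmed.reshape(n_modulations, points_per_modulation)


def total_peak_capacity(n1, n2):
    """Ideal GC x GC peak capacity: the product of the two column capacities."""
    return n1 * n2


# Hypothetical acquisition: 100 Hz detector, 5 s modulation period,
# 12 complete modulations plus 37 leftover points.
raw_trace = np.arange(100 * 5 * 12 + 37, dtype=float)
chromatogram = fold_modulated_signal(raw_trace, 100, 5.0)
print(chromatogram.shape)  # (12, 500)

# Hypothetical column capacities, for illustration only.
print(total_peak_capacity(400, 20))  # 8000
```

A single compound usually appears in several consecutive rows of such a matrix, which is why peak tables for GC × GC require an additional step that merges the modulated slices belonging to the same analyte.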

The use of these chemometric techniques applied to GC × GC data allows the verification of trends or recognition of patterns that would not be discernible using conventional manual inspection and assessment. In this context, GC × GC users rely on chemometric techniques of data analysis to extract reliable and accurate sample information. In this paper, we intend to review the most popular chemometric tools used for pattern recognition in GC × GC, addressing also the application of these techniques to GC × GC analysis of “omics” sample sets, either for exploration of the systems or for sample classification.
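
As a minimal illustration of this kind of exploratory pattern recognition, the Python sketch below unfolds a stack of GC × GC chromatograms into a samples × pixels matrix and applies principal component analysis (PCA) through a singular value decomposition. The function name `pca_on_unfolded_chromatograms`, the array sizes and the synthetic data are assumptions made for this example; it is not the workflow of any particular study, and real chromatograms would normally need baseline correction and retention time alignment first.

```python
import numpy as np


def pca_on_unfolded_chromatograms(chromatograms, n_components=2):
    """PCA on GC x GC chromatograms unfolded to a samples x pixels matrix.

    chromatograms: array of shape (n_samples, n_first_dim, n_second_dim).
    Returns the sample scores, the loadings refolded to chromatogram shape,
    and the fraction of variance explained by each retained component.
    """
    data = np.asarray(chromatograms, dtype=float)
    n_samples = data.shape[0]
    unfolded = data.reshape(n_samples, -1)        # one row per sample
    centered = unfolded - unfolded.mean(axis=0)   # mean-center every pixel
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    loadings = vt[:n_components].reshape((n_components,) + data.shape[1:])
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]


# Synthetic data (assumption): 10 samples of 20 x 30 pixels; the last five
# samples carry extra signal in one small retention window.
rng = np.random.default_rng(0)
samples = rng.normal(0.0, 0.01, size=(10, 20, 30))
samples[5:, 8:12, 10:15] += 1.0

scores, loadings, explained = pca_on_unfolded_chromatograms(samples)
print(scores[:, 0])  # the two groups fall on opposite sides of PC1
```

Refolding the loadings to the chromatogram shape lets the analyst see which retention regions drive the separation between groups, which is the information that manual inspection of hundreds of peaks would struggle to provide.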

Section snippets

Modulators

The heart of GC × GC systems, the interface that allows the creation of a second separation dimension, is the modulator. The first description of a modulation-based GC × GC setup was provided by Liu and Phillips in 1991 [7]: using a controlled cooling and heating scheme (thermal modulation), they were able to periodically collect fractions of eluate from the 1D column and re-inject them into the 2D column. The period corresponding to these cycles of collection and reinjection of eluate from 1D

Preprocessing

Before chemometric processing, it may be necessary to reorganize the data matrices generated by discrete GC × GC chromatograms; in addition, grouping of chromatograms for each processed sample or standard may be needed. These operations can further increase the size of the datasets, depending on the method chosen for adding tensors of different samples to the data structure. Grouping sample chromatograms into a single tensor also creates an additional dimension; on the other hand, this does not

Metabolomics

Due to the high separation capacity for complex mixtures, GC × GC associated with chemometrics has been widely used in metabolomic analysis in both target and non-target approaches [50,57]. The quality of the results obtained using the appropriate chemometric method contributes significantly to the explanatory analysis of complex biological systems in these applications [58]. Main approaches can vary, depending also on the nature of the biological system.

Future prospects

With the increasing number of applications of two-dimensional GC in omics sciences and the availability of robust, dependable commercial systems, its use is also expected to grow outside the academic environment. However, the full potential of the chromatographic technique cannot be extracted without chemometric methods. Nowadays, several chemometric tools are commercially available; moreover, open access software packages are also increasingly available. However, to fully

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work was supported by grants provided by Fapesp - São Paulo Research Foundation [Grant 2014/50867-3], CNPq - National Council for Scientific and Technological Development [Grant 465389/2014-7] and CAPES - Brazilian Coordination for the Improvement of Higher Education Personnel [Finance Code 001]. Scholarships were also provided by CAPES for BJP, ICMC and CAT; by PIBIC-CNPq for CRV and by FAPESP for JRBS [Grant 2018/25729-7] and GVV [Grant 2019/12556-0]. Finally, authors thank Espaço da

References (100)

  • R. Gorovenko et al. On the use of quadrupole mass spectrometric detection for flow modulated comprehensive two-dimensional gas chromatography. J. Chromatogr., A (2014)
  • L. Bai et al. Comparison of GC-VUV, GC-FID, and comprehensive two-dimensional GC–MS for the characterization of weathered and unweathered diesel fuels. Fuel (2018)
  • Y. Zushi et al. Pixel-by-pixel correction of retention time shifts in chromatograms from comprehensive two-dimensional gas chromatography coupled to high resolution time-of-flight mass spectrometry. J. Chromatogr., A (2017)
  • F. Savorani et al. A versatile tool for the rapid alignment of 1D NMR spectra. J. Magn. Reson. (2010)
  • G. Tomasi et al. Icoshift: an effective tool for the alignment of chromatographic data. J. Chromatogr., A (2011)
  • C. Quiroz-Moreno et al. RGCxGC toolbox: an R-package for data processing in comprehensive two-dimensional gas chromatography-mass spectrometry. Microchem. J. (2020)
  • I. Lukić et al. Combined targeted and untargeted profiling of volatile aroma compounds with comprehensive two-dimensional gas chromatography for differentiation of virgin olive oils according to variety and geographical origin. Food Chem. (2019)
  • G.L. Alexandrino et al. Investigating weathering in light diesel oils using comprehensive two-dimensional gas chromatography–High resolution mass spectrometry and pixel-based analysis: possibilities and limitations. J. Chromatogr., A (2019)
  • L.M. Dubois et al. Thermal desorption comprehensive two-dimensional gas chromatography coupled to variable-energy electron ionization time-of-flight mass spectrometry for monitoring subtle changes in volatile organic compound profiles of human blood. J. Chromatogr., A (2017)
  • S.D. Johanningsmeier et al. Metabolic footprinting of Lactobacillus buchneri strain LA1147 during anaerobic spoilage of fermented cucumbers. Int. J. Food Microbiol. (2015)
  • S. Wold et al. PLS-regression: a basic tool of chemometrics. Chemometr. Intell. Lab. Syst. (2001)
  • P. Geladi et al. Partial least-squares regression: a tutorial. Anal. Chim. Acta (1986)
  • A.M. Jiménez-Carvelo et al. Alternative data mining/machine learning methods for the analytical evaluation of food quality and authenticity – a review. Food Res. Int. (2019)
  • Y. Izadmanesh et al. Chemometric analysis of comprehensive two dimensional gas chromatography–mass spectrometry metabolomics data. J. Chromatogr., A (2017)
  • J. Jaumot et al. MCR-ALS GUI 2.0: new features and applications. Chemometr. Intell. Lab. Syst. (2015)
  • G. Ahmadi et al. A systematic study on the accuracy of chemical quantitative analysis using soft modeling methods. Chemometr. Intell. Lab. Syst. (2013)
  • J. Jaumot et al. A graphical user-friendly interface for MCR-ALS: a new tool for multivariate curve resolution in MATLAB. Chemometr. Intell. Lab. Syst. (2005)
  • P.Q. Tranchida et al. Current state of comprehensive two-dimensional gas chromatography-mass spectrometry with focus on processes of ionization. TrAC Trends Anal. Chem. (Reference Ed.) (2018)
  • L.L.P. van Stee et al. Peak detection methods for GCxGC: an overview. TrAC Trends Anal. Chem. (Reference Ed.) (2016)
  • J.R. Belinato et al. Opportunities for green microextractions in comprehensive two-dimensional gas chromatography/mass spectrometry-based metabolomics – a review. Anal. Chim. Acta (2018)
  • S.L. Andersen et al. Metabolome-based signature of disease pathology in MS. Mult. Scler. Relat. Disord. (2019)
  • T. Miyazaki et al. Two-dimensional gas chromatography time-of-flight mass spectrometry-based serum metabolic fingerprints of neonatal calves before and after first colostrum ingestion. J. Dairy Sci. (2017)
  • N. Koen et al. Metabolomics of colistin methanesulfonate treated Mycobacterium tuberculosis. Tuberculosis (2018)
  • D.T. Loots et al. A metabolomics investigation of the function of the ESX-1 gene cluster in mycobacteria. Microb. Pathog. (2016)
  • J.R.B. de Souza et al. In vivo investigation of the volatile metabolome of antiphytopathogenic yeast strains active against Penicillium digitatum using comprehensive two-dimensional gas chromatography and multivariate data analysis. Microchem. J. (2018)
  • G. Vanini et al. Analytical advanced techniques in the molecular-level characterization of Brazilian crude oils. Microchem. J. (2018)
  • D.M. Coutinho et al. Rapid hydrocarbon group-type semi-quantification in crude oils by comprehensive two-dimensional gas chromatography. Fuel (2018)
  • D. França et al. Speciation and quantification of high molecular weight paraffins in Brazilian whole crude oils using high-temperature comprehensive two-dimensional gas chromatography. Fuel (2018)
  • B.Q. Araújo et al. Occurrence of extended tetracyclic polyprenoid series in crude oils. Org. Geochem. (2018)
  • P.S. Prata et al. Discriminating Brazilian crude oils using comprehensive two-dimensional gas chromatography-mass spectrometry and multiway principal component analysis. J. Chromatogr., A (2016)
  • D.L. Vale et al. Comprehensive and multidimensional tools for crude oil property prediction and petrochemical industry refinery inferences. Fuel (2018)
  • S. Carlin et al. Regional features of northern Italian sparkling wines, identified using solid-phase micro extraction and comprehensive two-dimensional gas chromatography coupled with time-of-flight mass spectrometry. Food Chem. (2016)
  • P.H. Stefanuto et al. Advanced method optimization for volatile aroma profiling of beer using two-dimensional gas chromatography time-of-flight mass spectrometry. J. Chromatogr., A (2017)
  • E.M. Humston et al. Quantitative assessment of moisture damage for cacao bean quality using two-dimensional gas chromatography combined with time-of-flight mass spectrometry and chemometrics. J. Chromatogr., A (2010)
  • D.J. Beale et al. Review of Recent Developments in GC–MS Approaches to Metabolomics-Based Research (2018)
  • L. Mondello. Comprehensive Chromatography in Combination with Mass Spectrometry (2011)
  • M.J.E. Golay. Vapor phase chromatography and the telegrapher’s equation. Anal. Chem. (1957)
  • M.S.S. Amaral et al. Comprehensive two-dimensional gas chromatography advances in technology and applications: biennial update. Anal. Chem. (2020)
  • Z. Liu et al. Comprehensive two-dimensional gas chromatography using an on-column thermal modulator interface. J. Chromatogr. Sci. (1991)
  • S.E. Prebihalo et al. Multidimensional gas chromatography: advances in instrumentation, chemometrics, and applications. Anal. Chem. (2018)