Suspect screening of environmental contaminants by UHPLC-HRMS and transposable Quantitative Structure-Retention Relationship modelling

https://doi.org/10.1016/j.jhazmat.2020.124652

Highlights

  • A Quantitative Structure-Retention Relationship model was built to improve suspect screening of environmental contaminants.

  • A novel approach using mobile phase composition at solute elution was introduced, making the model easily transposable.

  • We thoroughly validated the method and determined the degree of uncertainty of its predictions.

  • We showed the feasibility, efficiency and limitations of this approach through various application examples.

Abstract

A Quantitative Structure-Retention Relationship (QSRR) model is proposed to increase the confidence level associated with the identification of organic contaminants by Ultra-High Performance Liquid Chromatography hyphenated to High Resolution Mass Spectrometry (UHPLC-HRMS) in environmental samples under a suspect screening approach. The model was built from a selection of 8 easily accessible physicochemical descriptors and was validated on a set of 274 organic compounds commonly found in environmental samples. The proposed predictive approach is based on the mobile phase composition at solute elution (expressed as % acetonitrile), which has the major advantage of making the model reusable by other laboratories, since the elution composition is independent of both the column geometry and the UHPLC system. The model quality was assessed and was altered neither by columns from different lots nor by the complex matrices of environmental water samples. Consequently, the retention of any organic compound present in water samples is expected to be predicted within ± 14.3% acetonitrile by our model. Solute retention can therefore be used as a supplementary tool for the identification of environmental contaminants by UHPLC-HRMS, in addition to the mass spectrometry data already used in the suspect screening approach.

Introduction

With the increasing concern for environmental issues, the identification of a wide range of organic contaminants in environmental samples is now a crucial analytical challenge. Among the available strategies to perform this characterization, suspect screening has grown rapidly over the past few years (Asghar et al., 2018, Bade et al., 2015a, Deeb et al., 2017). This approach consists of identifying a variety of compounds expected to be present in a given sample, in most cases by Ultra-High Performance Liquid Chromatography coupled to High-Resolution Mass Spectrometry (UHPLC-HRMS). Analytical data provided by mass spectrometry are compared to the theoretical mass, isotopic distribution and fragmentation data of suspected compounds (usually compiled in chemical databases) in order to suggest the presence of some of these contaminants in the sample. Retention data, in addition to mass spectrometry data, can be useful to discriminate two or more suspected contaminants having the same mass and similar isotopic profiles. The major drawback of solute retention as an identification criterion is its high dependence on analytical conditions, which can be extremely different from one laboratory to another and even from one column to another. Unlike mass spectrometry data, it is therefore impossible to register a general retention time for each compound in chemical databases. To address this issue, the scientific community is increasingly working on the prediction of chromatographic retention based on the physical and chemical properties of compounds (Randazzo et al., 2016, Taraji et al., 2017a, Wen et al., 2018, Zhang et al., 2017). This approach, also called Quantitative Structure-Retention Relationship (QSRR), allows the retention properties of suspected contaminants to be predicted under the specific conditions used by the analytical laboratory. Solute retention can therefore be efficiently used by this laboratory as a supplementary tool for contaminant identification in a suspect screening approach.
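As an illustration of how predicted retention could serve as a supplementary filter, the sketch below keeps only those suspects whose predicted retention value lies within a tolerance of the observed one. It is a minimal sketch and not the authors' implementation: the function name, the candidate names and the numerical values are hypothetical, and the default tolerance simply reuses the ± 14.3% acetonitrile prediction uncertainty quoted in the abstract.

```python
# Hypothetical sketch: filter isobaric suspects by predicted retention.
# The tolerance reuses the +/- 14.3 % acetonitrile uncertainty quoted in the
# abstract; all candidate names and values below are illustrative only.

TOLERANCE_PCT_ACN = 14.3


def plausible_candidates(observed_pct_acn, predicted_pct_acn,
                         tolerance=TOLERANCE_PCT_ACN):
    """Return the candidates whose predicted elution composition
    (% acetonitrile) lies within `tolerance` of the observed value."""
    return [
        name
        for name, predicted in predicted_pct_acn.items()
        if abs(observed_pct_acn - predicted) <= tolerance
    ]


if __name__ == "__main__":
    # Two hypothetical suspects with the same mass but different predictions.
    predictions = {"candidate_A": 35.0, "candidate_B": 70.0}
    print(plausible_candidates(40.0, predictions))
```

In this hypothetical case only candidate_A is retained, since candidate_B is predicted to elute 30% acetonitrile away from the observed value.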

Existing studies on QSRR model development show a wide diversity in terms of application areas (metabolomics (Cao et al., 2015, Creek et al., 2011, Gorynski et al., 2013), proteomics (Baczek et al., 2007), other omics-like strategies (Randazzo et al., 2016, Taraji et al., 2017a), the pharmaceutical sector (Wen et al., 2018, Zhang et al., 2017, Hancock et al., 2005), environmental science (Bade et al., 2015a, Aalizadeh et al., 2019), etc.), range of compounds considered and molecular descriptors used to predict retention. In the field of environmental analytical chemistry, some authors report very simple models based on a single descriptor, often the logarithm of the water-octanol distribution coefficient (logP, or logD for ionizable compounds), in Reversed Phase Liquid Chromatography (RPLC) (Bade et al., 2015b). Although they are quite easy to build and use, these models have the drawback of relying on one descriptor that cannot describe, on its own, the whole complexity of the interactions occurring during the retention process. On the other hand, some studies offer QSRR models based on hundreds of molecular descriptors (Hancock et al., 2005, Taraji et al., 2017b). With such a large number of parameters, these models can be laborious to apply in routine screening, since descriptor values have to be found and collected for every suspected contaminant. Moreover, this type of model raises the question of the relevance of some descriptors, since many of them involve computationally derived, complex molecular properties, and some seem far removed from the description of chromatographic phenomena in terms of physicochemical interactions. Some authors suggest a more convenient approach, based on the choice of a few descriptors, each of them highly relevant for the suspected compounds to be considered (Creek et al., 2011, Gorynski et al., 2013).
Although they show some differences in the selection of molecular descriptors, almost all the existing studies agree on the choice of the retention time (tR) as the retention property to predict (Cao et al., 2015, Aalizadeh et al., 2019). However, the retention time of a given compound is highly influenced by a broad variety of experimental conditions: gradient conditions (composition range and gradient time), stationary phase, column geometry (length, internal diameter and particle size), and flow rate. The geometry of the gradient system (dwell volume and extra-column volume) can also affect the retention time. As a result, any model developed to predict retention time is inextricably linked to a given set of analytical conditions. In most cases, due to the great diversity of chromatographic systems, columns, and analytical methods in the separation science community, such a model can only be used by the laboratory that developed it. A recent study reflects the growing awareness within the scientific community of the lack of transposability of QSRR models predicting retention time, and suggests a novel approach called the Retention Time Interpolation Scaling strategy (Aalizadeh et al., 2019). Samples are spiked with well-defined analytical standards that are used to apply a correction to the retention times of the compounds and therefore make these retention times more reliable.

The aim of our study is to use QSRR modelling to address two major challenges. The first goal is to considerably increase the confidence associated with the identification of suspected organic contaminants by UHPLC-HRMS, using chromatographic data in addition to spectral information. To this end, a QSRR model was built and rigorously validated from 287 phytosanitary and pharmaceutical contaminants, using a C18 silica-based stationary phase, which is the most widely used stationary phase for suspect screening of environmental contaminants. To go further towards this goal, the contribution of other stationary phases was studied, as they could potentially offer good complementarity with the C18 stationary phase. The second challenge is to make these models as replicable and reusable as possible by other environmental analysis laboratories. To do so, our selection of molecular descriptors differed from reported ones by the choice of a reasonable number of easily accessible and chromatographically relevant physicochemical properties. Finally, a novel approach was introduced in which the retention property to predict is the mobile phase composition at the time the solute is eluted, instead of the usual retention time.
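To make the idea of an elution composition concrete, the sketch below converts a retention time into the mobile phase composition reaching the column outlet at that time. It assumes a single linear gradient and uses the standard gradient-delay relationship (retention time corrected by the column dead time and the system dwell time); the exact expression used by the authors is not given in this excerpt, so the formula, the function name and all numerical values are assumptions for illustration only.

```python
# Hypothetical sketch: retention time -> elution composition for a single
# linear gradient. Assumes the composition at the column outlet at time t is
# the one programmed at time (t - dwell time - dead time). All values below
# are illustrative and not taken from the paper.


def elution_composition(t_r, t_0, t_dwell, c_initial, c_final, t_gradient):
    """Return the % acetonitrile at the column outlet when the solute elutes.

    t_r        : retention time (min)
    t_0        : column dead time (min)
    t_dwell    : system dwell time, i.e. dwell volume / flow rate (min)
    c_initial  : composition at the start of the gradient (% acetonitrile)
    c_final    : composition at the end of the gradient (% acetonitrile)
    t_gradient : gradient duration (min)
    """
    t_effective = t_r - t_0 - t_dwell
    if t_effective <= 0:
        # The gradient has not yet reached the column outlet.
        return c_initial
    if t_effective >= t_gradient:
        # The solute elutes after the end of the gradient.
        return c_final
    slope = (c_final - c_initial) / t_gradient
    return c_initial + slope * t_effective


if __name__ == "__main__":
    # Same 5-95 % gradient over 20 min on two hypothetical systems that
    # differ in dwell time and dead time: the retention times differ, but
    # the computed elution composition is the same.
    print(elution_composition(10.0, 1.0, 1.0, 5.0, 95.0, 20.0))
    print(elution_composition(9.0, 0.5, 0.5, 5.0, 95.0, 20.0))
```

Expressing retention this way removes the explicit contribution of the dwell and dead times, which is the sense in which such a quantity is less tied to one instrument than the raw retention time.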

Section snippets

Chemicals

A list of 287 analytical standards was used for the development of the model (see Supporting Information, Table S1). This list contains pharmaceuticals and pesticides commonly found in environmental samples. A 95:5 ultra-pure water (LC-MS grade, obtained by a MilliQ LC-Pak Polisher)/acetonitrile (ULC/MS grade, purchased from Biosolve Chimie, Dieuze, France) solvent was spiked with these compounds at 20 µg/L for pesticides and 50 µg/L for pharmaceuticals. LC-MS grade formic acid purchased from

Selection of molecular descriptors

The most used molecular properties reported in the literature are listed in Table 1.

The first simplification of this selection was based on the ease of obtaining descriptor values associated with the listed properties, which allowed the Molecular Volume and the Molecular Orbital Energies to be removed.

The following properties were considered:

  • the logarithm of the water-octanol distribution coefficient at pH = 2.7 (logD) as a descriptor of water-octanol partition

  • the sum of the absolute value of

Conclusions

A QSRR model was developed and validated with 274 phytosanitary and pharmaceutical contaminants commonly found in environmental samples. A novel approach was chosen, consisting in using the elution mobile phase composition as the retention property instead of the retention time. This approach removes any effect of instrumental parameters, such as the instrument dwell volume or the column dead volume, on the model. In addition, it was shown that the predictive ability of our model was

Supporting information

Additional experimental procedures and data (PDF). Table S1: List of compounds used for the development and validation of the QSRR model; Table S2: Selection of molecular descriptors; Fig. S1: Plot of elution composition of compounds analyzed in HILIC mode against Acquity HSS T3.

Macro for model application (XLSM). Table S3.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The authors wish to thank the French National Office for Biodiversity (OFB) for funding through the Aquaref program. We also thank Benjamin Renard for support in statistical modelling and Loïc Richard for developing the Excel macro that allows simple use of the developed model.

References (61)

  • A.A. Deeb et al.

    Suspect screening of micropollutants and their transformation products in advanced wastewater treatment

    Sci. Total Environ.

    (2017)
  • T. Djaković-Sekulić et al.

    Evaluation of the predictive power of calculation procedure for molecular hydrophobicity of some estradiol derivates

    J. Chromatogr. B

    (2002)
  • I. Ferrer et al.

    Analysis of 100 pharmaceuticals and their degradates in water samples by liquid chromatography/quadrupole time-of-flight mass spectrometry

    J. Chromatogr. A

    (2012)
  • A. Golbraikh et al.

    Beware of q2!

    J. Mol. Gr. Model.

    (2002)
  • K. Gorynski et al.

    Quantitative structure-retention relationships models for prediction of high performance liquid chromatography retention time of small molecules: endogenous metabolites and banned compounds

    Anal. Chim. Acta

    (2013)
  • D. Guillarme et al.

    Method transfer for fast liquid chromatography in pharmaceutical analysis: application to short columns packed with small particle. Part II: gradient experiments

    Eur. J. Pharm. Biopharm.

    (2008)
  • T. Hancock et al.

    A performance comparison of modern statistical techniques for molecular descriptor selection and retention prediction in chromatographic QSRR studies

    Chemom. Intell. Lab. Syst.

    (2005)
  • B.G. Hansen et al.

    QSARs for KOW and KOC of PCB congeners: a critical examination of data, assumptions and statistical approaches

    Chemosphere

    (1999)
  • M. Ibanez et al.

    UHPLC-QTOF MS screening of pharmaceuticals and their metabolites in treated wastewater samples from Athens

    J. Hazard. Mater.

    (2017)
  • L. Kubik et al.

    Quantitative structure-(chromatographic) retention relationship models for dissociating compounds

    J. Pharm. Biomed. Anal.

    (2016)
  • D. Lambropoulou et al.

    Degradation of venlafaxine using TiO2/UV process: kinetic studies, RSM optimization, identification of transformation products and toxicity evaluation

    J. Hazard. Mater.

    (2017)
  • B. Lei et al.

    Novel approaches for retention time prediction of oligonucleotides in ion-pair reversed-phase high-performance liquid chromatography

    J. Chromatogr. A

    (2009)
  • A.D. McEachran et al.

    A comparison of three liquid chromatography (LC) retention time prediction models

    Talanta

    (2018)
  • C.B. Mollerup et al.

    Prediction of collision cross section and retention time for broad scope screening in gradient reversed-phase liquid chromatography-ion mobility-high resolution accurate mass spectrometry

    J. Chromatogr. A

    (2018)
  • K. Munro et al.

    Artificial neural network modelling of pharmaceutical residue retention times in wastewater extracts using gradient liquid chromatography-high resolution mass spectrometry data

    J. Chromatogr. A

    (2015)
  • G.M. Randazzo et al.

    Prediction of retention time in reversed-phase liquid chromatography as a tool for steroid identification

    Anal. Chim. Acta

    (2016)
  • Á. Sándi et al.

    Characterization of reversed-phase columns using the linear free energy relationship

    J. Chromatogr. A

    (2000)
  • A. Sangion et al.

    Hazard of pharmaceuticals for aquatic environment: prioritization by structural approaches and prediction of ecotoxicity

    Environ. Int.

    (2016)
  • R. Servien et al.

    TyPol – a new methodology for organic compounds clustering based on their molecular characteristics and environmental behavior

    Chemosphere

    (2014)
  • M. Taraji et al.

    Prediction of retention in hydrophilic interaction liquid chromatography using solute molecular descriptors based on chemical structures

    J. Chromatogr. A

    (2017)

ORCID: Eloi Bride: 0000-0002-0576-6211.

ORCID: Christelle Margoum: 0000-0002-7386-4454.