Suspect screening of environmental contaminants by UHPLC-HRMS and transposable Quantitative Structure-Retention Relationship modelling
Introduction
With the increasing concern for environmental issues in our modern world, the identification of a wide range of organic contaminants in environmental samples is now a crucial analytical challenge. Among all the available strategies to perform this characterization, suspect screening has been burgeoning over the past few years (Asghar et al., 2018, Bade et al., 2015a, Deeb et al., 2017). This approach consists of identifying a variety of compounds expected to be present in a given sample, in most cases by Ultra-High Performance Liquid Chromatography coupled to High-Resolution Mass Spectrometry (UHPLC-HRMS). Analytical data provided by mass spectrometry are compared to the theoretical mass, isotopic distribution and fragmentation data of suspected compounds, usually compiled in chemical databases, in order to suggest the presence of some of these contaminants in the sample. Retention data, in addition to mass spectrometry data, can be useful to discriminate between two or more suspected contaminants having the same mass and similar isotopic profiles. The major drawback of solute retention as an identification criterion is its strong dependence on analytical conditions, which can differ greatly from one laboratory to another and even from one column to another. Unlike mass spectrometry data, it is therefore impossible to record a general retention time for each compound in chemical databases. To address this issue, the scientific community is increasingly working on the prediction of chromatographic retention based on physical and chemical properties of compounds (Randazzo et al., 2016, Taraji et al., 2017a, Wen et al., 2018, Zhang et al., 2017). This approach, also called Quantitative Structure-Retention Relationship (QSRR), allows retention properties of suspected contaminants to be predicted under the specific conditions used by the analytical laboratory. Solute retention can therefore be efficiently used by this laboratory as a supplementary tool for contaminant identification in a suspect screening approach.
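To make the role of retention as a second identification criterion concrete, the following minimal sketch filters a suspect list first by mass accuracy and then by agreement with a predicted retention value. It is an illustration only: the class and function names, the tolerance values and the two isomeric suspects are hypothetical and are not taken from this study.

```python
from dataclasses import dataclass


@dataclass
class Suspect:
    name: str
    theoretical_mz: float       # theoretical m/z of the expected ion
    predicted_retention: float  # QSRR-predicted retention value (same unit as the measurement)


def ppm_error(measured_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def match_suspects(measured_mz, measured_retention, suspects,
                   ppm_tol=5.0, retention_tol=1.0):
    """Return the names of suspects compatible with both mass and retention.

    ppm_tol and retention_tol are illustrative defaults, not recommended values.
    """
    hits = []
    for s in suspects:
        if abs(ppm_error(measured_mz, s.theoretical_mz)) > ppm_tol:
            continue  # rejected on mass accuracy
        if abs(measured_retention - s.predicted_retention) > retention_tol:
            continue  # rejected on retention
        hits.append(s.name)
    return hits


# Two hypothetical isomers: identical theoretical m/z, different predicted retention.
suspects = [
    Suspect("isomer_A", 300.1000, 4.2),
    Suspect("isomer_B", 300.1000, 9.8),
]

# Mass alone cannot separate the two isomers; the retention filter can.
print(match_suspects(300.1006, 4.5, suspects))
```

In this sketch both isomers pass the mass filter, and only the retention comparison removes one of them, which is the situation described above for compounds with the same mass and similar isotopic profiles.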
Existing studies on QSRR model development show wide diversity in terms of application areas (metabolomics (Cao et al., 2015, Creek et al., 2011, Gorynski et al., 2013), proteomics (Baczek et al., 2007), other omics-like strategies (Randazzo et al., 2016, Taraji et al., 2017a), the pharmaceutical sector (Wen et al., 2018, Zhang et al., 2017, Hancock et al., 2005), environmental science (Bade et al., 2015a, Aalizadeh et al., 2019)…), the range of compounds considered, and the molecular descriptors used to predict retention. In the field of environmental analytical chemistry, some authors report very simple models based on a single descriptor, often the logarithm of the water-octanol partition coefficient (logP, or the distribution coefficient logD for ionizable compounds) in Reversed Phase Liquid Chromatography (RPLC) (Bade et al., 2015b). Although they are quite easy to build and use, these models have the drawback that a single descriptor cannot, on its own, describe the whole complexity of the interactions occurring during the retention process. On the other hand, some studies propose QSRR models based on hundreds of molecular descriptors (Hancock et al., 2005, Taraji et al., 2017b). With such a large number of parameters, these models can be laborious to apply in routine screening, since descriptor values have to be found and collected for every suspected contaminant. Moreover, this type of model raises the question of the relevance of some descriptors, since many of them involve complex, computationally derived molecular properties. In addition, some of them seem far removed from the description of chromatographic phenomena in terms of physicochemical interactions. Some authors suggest a more convenient approach, based on a small number of descriptors, each highly relevant to the suspected compounds under consideration (Creek et al., 2011, Gorynski et al., 2013).
Although they differ in the selection of molecular descriptors, almost all existing studies agree on the choice of the retention time (tR) as the retention property to predict (Cao et al., 2015, Aalizadeh et al., 2019). However, the retention time of a given compound is strongly influenced by a broad variety of experimental conditions: gradient conditions (composition range and gradient time), stationary phase, column geometry (length, internal diameter and particle size), and flow rate. The geometry of the gradient system (dwell volume and extra-column volume) can also affect the retention time. As a result, any model developed to predict retention time is inextricably linked to a given set of analytical conditions. In most cases, due to the great diversity of chromatographic systems, columns, and analytical methods in the separation science community, such a model can only be used by the laboratory that developed it. A recent study reflects the rising awareness in the scientific community of the lack of transposability of QSRR models predicting retention time, and suggests a novel approach called the Retention Time Interpolation Scaling strategy (Aalizadeh et al., 2019). Samples are spiked with well-defined analytical standards, which are then used to correct the retention times of the compounds and thereby make them more reliable.
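The general idea of correcting retention times with spiked standards can be illustrated by a piecewise-linear rescaling between calibrants. The sketch below is a generic illustration under stated assumptions and is not the published procedure of Aalizadeh et al. (2019): the function name, the calibrant retention times and the index values assigned to the calibrants are all hypothetical.

```python
from bisect import bisect_right


def rescale_retention(t_r, calibrant_times, calibrant_index):
    """Map a measured retention time onto a calibrant-defined index scale.

    calibrant_times: retention times of the spiked standards on the local
                     system, sorted in ascending order.
    calibrant_index: index value assigned to each standard (arbitrary here).
    Linear interpolation is applied between the two bracketing standards.
    """
    if not calibrant_times[0] <= t_r <= calibrant_times[-1]:
        raise ValueError("retention time lies outside the calibrated range")
    # Position of the upper bracketing calibrant (capped at the last one).
    i = min(bisect_right(calibrant_times, t_r), len(calibrant_times) - 1)
    t_lo, t_hi = calibrant_times[i - 1], calibrant_times[i]
    idx_lo, idx_hi = calibrant_index[i - 1], calibrant_index[i]
    fraction = (t_r - t_lo) / (t_hi - t_lo)
    return idx_lo + fraction * (idx_hi - idx_lo)


index_scale = [0.0, 500.0, 1000.0]  # hypothetical values assigned to three standards

# The same hypothetical compound measured on two systems whose calibrants
# elute at different times.
system_a = rescale_retention(2.5, [1.0, 4.0, 9.0], index_scale)
system_b = rescale_retention(3.25, [1.5, 5.0, 10.5], index_scale)
print(system_a, system_b)
```

With these made-up numbers the two raw retention times differ, while the rescaled values coincide because the compound sits at the same relative position between the first two standards on both systems.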
The aim of our study is to use QSRR modelling to address two major challenges. The first goal is to considerably increase the confidence associated with the identification of suspected organic contaminants by UHPLC-HRMS by using chromatographic data in addition to spectral information. To this end, a QSRR model was built and rigorously validated using 287 phytosanitary and pharmaceutical contaminants on a C18 silica-based stationary phase, the most widely used stationary phase for suspect screening of environmental contaminants. To go further, the contribution of other stationary phases was studied, as they could potentially offer good complementarity with the C18 stationary phase. The second challenge is to make these models as replicable and reusable by other environmental analysis laboratories as possible. To do so, our selection of molecular descriptors differed from previously reported ones in relying on a reasonable number of easily accessible and chromatographically relevant physicochemical properties. Finally, a novel approach was introduced in which the retention property to predict is the mobile phase composition at the time the solute elutes, instead of the usual retention time.
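As an illustration of this last point, the sketch below converts a retention time into the mobile phase composition at elution for a single linear gradient. It relies on the standard linear-gradient relationship, in which the composition leaving the column at time tR is the one programmed at tR minus the dwell time and the column dead time. This is an assumption made for the example and may differ from the exact calculation used in this work, and all numerical values are hypothetical.

```python
def elution_composition(t_r, phi_start, phi_end, t_gradient,
                        dwell_volume, dead_volume, flow_rate):
    """Mobile phase composition (%B) at the moment a solute elutes.

    Assumes one linear gradient from phi_start to phi_end (%B, with
    phi_end > phi_start) over t_gradient minutes, starting at injection.
    Volumes are in mL and the flow rate in mL/min.
    """
    t_dwell = dwell_volume / flow_rate  # delay before the gradient reaches the column inlet
    t_dead = dead_volume / flow_rate    # time for the mobile phase to cross the column
    slope = (phi_end - phi_start) / t_gradient
    phi = phi_start + slope * (t_r - t_dwell - t_dead)
    # Outside the gradient window the composition stays at its start or end value.
    return min(max(phi, phi_start), phi_end)


# Same hypothetical compound and column on two instruments that differ only in
# dwell volume; the retention times were constructed to differ by the
# corresponding dwell-time difference.
common = dict(phi_start=5.0, phi_end=95.0, t_gradient=10.0,
              dead_volume=0.25, flow_rate=0.5)
phi_a = elution_composition(5.7, dwell_volume=0.1, **common)
phi_b = elution_composition(6.5, dwell_volume=0.5, **common)
print(phi_a, phi_b)
```

In this constructed example the two retention times differ by 0.8 min, whereas the computed elution compositions agree, which is why a composition-based retention property is less tied to a particular instrument than the retention time itself.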
Chemicals
A list of 287 analytical standards was used for the development of the model (see Supporting Information, Table S1). This list contains pharmaceuticals and pesticides commonly found in environmental samples. A 95:5 ultra-pure water (LC-MS grade, obtained by a MilliQ LC-Pak Polisher)/acetonitrile (ULC/MS grade, purchased from Biosolve Chimie, Dieuze, France) solvent was spiked with these compounds at 20 µg/L for pesticides and 50 µg/L for pharmaceuticals. LC-MS grade formic acid purchased from
Selection of molecular descriptors
The molecular properties most frequently reported in the literature are listed in Table 1.
The first simplification of this selection was based on the ease of obtaining descriptor values for the listed properties, which allowed the Molecular Volume and the Molecular Orbitals Energies to be removed.
The following properties were considered:
- the logarithm of the water-octanol distribution coefficient at pH = 2.7 (logD) as a descriptor of water-octanol partition
- the sum of the absolute value of
Conclusions
A QSRR model was developed and validated with 274 phytosanitary and pharmaceutical contaminants commonly found in environmental samples. A novel approach was chosen, in which the elution mobile phase composition is used as the retention property instead of the retention time. This approach removes any effect of instrumental parameters such as the instrument dwell volume or the column dead volume on the model. In addition, it was shown that the predictive ability of our model was
Supporting information
Additional experimental procedures and data (PDF). Table S1: List of compounds used for the development and validation of the QSRR model; Table S2: Selection of molecular descriptors; Fig. S1: Plot of elution composition of compounds analyzed in HILIC mode against Acquity HSS T3.
Macro for model application (XLSM). Table S3.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
The authors wish to thank the French National Office for Biodiversity (OFB) for funding through the Aquaref program. We also thank Benjamin Renard for support in statistical modelling and Loïc Richard for developing the Excel macro that allows simple use of the developed model.
References (61)
- et al. Development and application of retention time prediction models in the suspect and non-target screening of emerging contaminants. J. Hazard. Mater. (2019)
- et al. Suspect screening and target quantification of human pharmaceutical residues in the surface water of Wuhan, China, using UHPLC-Q-Orbitrap HRMS. Sci. Total Environ. (2018)
- et al. Suspect screening of large numbers of emerging contaminants in environmental waters using artificial neural networks for chromatographic retention time prediction and high resolution mass spectrometry data analysis. Sci. Total Environ. (2015)
- et al. Critical evaluation of a simple retention time predictor based on LogKow as a complementary tool in the identification of emerging contaminants in water. Talanta (2015)
- et al. Gradient liquid chromatographic retention time prediction for suspect screening applications: a critical assessment of a generalised artificial neural network-based approach across 10 multi-residue reversed-phase analytical methods. Talanta (2016)
- et al. Categorizing chlordecone potential degradation products to explore their environmental fate. Sci. Total Environ. (2017)
- et al. The use of LC predicted retention times to extend metabolites identification with SWATH data acquisition. J. Chromatogr. B (2017)
- et al. Development of a Retention Time Interpolation scale (RTi) for liquid chromatography coupled to mass spectrometry in both positive and negative ionization modes. J. Chromatogr. A (2018)
- et al. Nontarget screening using passive air and water sampling with a level II fugacity model to identify unregulated environmental contaminants. J. Environ. Sci. (2017)
- et al. Modelling of UPLC behaviour of acylcarnitines by quantitative structure-retention relationships. J. Pharm. Biomed. Anal. (2014)
- Suspect screening of micropollutants and their transformation products in advanced wastewater treatment. Sci. Total Environ.
- Evaluation of the predictive power of calculation procedure for molecular hydrophobicity of some estradiol derivates. J. Chromatogr. B
- Analysis of 100 pharmaceuticals and their degradates in water samples by liquid chromatography/quadrupole time-of-flight mass spectrometry. J. Chromatogr. A
- Beware of q2! J. Mol. Gr. Model.
- Quantitative structure-retention relationships models for prediction of high performance liquid chromatography retention time of small molecules: endogenous metabolites and banned compounds. Anal. Chim. Acta
- Method transfer for fast liquid chromatography in pharmaceutical analysis: application to short columns packed with small particle. Part II: gradient experiments. Eur. J. Pharm. Biopharm.
- A performance comparison of modern statistical techniques for molecular descriptor selection and retention prediction in chromatographic QSRR studies. Chemom. Intell. Lab. Syst.
- QSARs for KOW and KOC of PCB congeners: a critical examination of data, assumptions and statistical approaches. Chemosphere
- UHPLC-QTOF MS screening of pharmaceuticals and their metabolites in treated wastewater samples from Athens. J. Hazard. Mater.
- Quantitative structure-(chromatographic) retention relationship models for dissociating compounds. J. Pharm. Biomed. Anal.
- Degradation of venlafaxine using TiO2/UV process: kinetic studies, RSM optimization, identification of transformation products and toxicity evaluation. J. Hazard. Mater.
- Novel approaches for retention time prediction of oligonucleotides in ion-pair reversed-phase high-performance liquid chromatography. J. Chromatogr. A
- A comparison of three liquid chromatography (LC) retention time prediction models. Talanta
- Prediction of collision cross section and retention time for broad scope screening in gradient reversed-phase liquid chromatography-ion mobility-high resolution accurate mass spectrometry. J. Chromatogr. A
- Artificial neural network modelling of pharmaceutical residue retention times in wastewater extracts using gradient liquid chromatography-high resolution mass spectrometry data. J. Chromatogr. A
- Prediction of retention time in reversed-phase liquid chromatography as a tool for steroid identification. Anal. Chim. Acta
- Characterization of reversed-phase columns using the linear free energy relationship. J. Chromatogr. A
- Hazard of pharmaceuticals for aquatic environment: prioritization by structural approaches and prediction of ecotoxicity. Environ. Int.
- TyPol – a new methodology for organic compounds clustering based on their molecular characteristics and environmental behavior. Chemosphere
- Prediction of retention in hydrophilic interaction liquid chromatography using solute molecular descriptors based on chemical structures. J. Chromatogr. A