Enhanced database creation with in silico workflows for suspect screening of unknown tebuconazole transformation products in environmental samples by UHPLC-HRMS

https://doi.org/10.1016/j.jhazmat.2022.129706

Highlights

  • A suspect database of 291 TPs of tebuconazole was created.

  • Twelve cutting-edge in silico predictors were used and compared.

  • RT and a priori fragmentation predictions were conducted on predicted TPs.

  • Comparison of predictions from transformation predictors revealed the known TPs.

  • Workflow-aided retrospective analysis of surface-water samples highlighted new TPs.

Abstract

The search for and identification of organic contaminants in agricultural watersheds has become a crucial effort to better characterize watershed contamination by pesticides. The past decade has brought a more holistic view of watershed contamination via the deployment of powerful analytical strategies, such as non-target and suspect screening analysis, that can search for more contaminants and their transformation products. However, suspect screening analysis remains broadly confined to known molecules, primarily due to the lack of analytical standards and suspect databases for unknowns such as pesticide transformation products. Here we developed a novel workflow that cross-compares the results of various in silico prediction tools against literature data to create an enhanced database for suspect screening of pesticide transformation products. This workflow was applied to tebuconazole, used here as a model pesticide, and resulted in a suspect screening database containing 291 transformation products. The chromatographic retention times and tandem mass spectra were predicted for each of these compounds using 6 models based on multilinear regression and more complex machine-learning algorithms. This comprehensive approach to the investigation and identification of tebuconazole transformation products was applied retrospectively to environmental samples and revealed 6 transformation products identified for the first time in river water samples.

Introduction

Pesticides are chemical compounds used mainly in agriculture to control plant pests and improve crop yields. Once in the environment, pesticides can be degraded into transformation products (TPs) via both biotic and abiotic transformation processes (Fenner et al., 2013, Escher and Fenner, 2011). The chemical compounds formed by these transformation processes are generally lower, more persistent in the environment and more mobile than the parent compound, which can increase their transport to surface water and groundwater by runoff or seepage from agricultural soils (Boxall et al., 2004, Postigo and Barceló, 2015). As a rule, these structural and property changes do not specifically increase the toxicity of TPs compared to parent compounds. However, within the multitude of products formed, some may be exceptions to this rule, which makes it important to identify them (Escher and Fenner, 2011). This blind spot in identification means that the toxicity of pesticides and their TPs in water bodies is globally underestimated (Mahler et al., 2021, Moschet et al., 2014). Novel approaches are needed in order to identify these unknown TP compounds.

The simultaneous quantification of pesticides and their known TPs in water bodies has revealed the presence of TPs at higher levels of concentration and occurrence than their parent compounds. As an example, in headwater streams, Le Cor et al. (2021) highlighted that pesticide TPs accounted for more than half of the substances detected and that TP concentrations were often ten times higher than the parent-compound concentrations (0.46 ± 0.02 μg/L for the TP metazachlor-ESA versus 0.047 ± 0.007 μg/L for the parent metazachlor). However, such targeted analyses are limited by the lack of standards for most pesticide TPs. To overcome this gap, powerful techniques such as high-resolution mass spectrometry (HRMS) have been developed over the last decade. Gas chromatography (GC) or liquid chromatography (LC) coupled with HRMS can serve to develop suspect and non-target screening (NTS) strategies that bring a more holistic understanding of the environmental fate of organic chemicals by untangling the unknowns (Escher et al., 2020).

Suspect screening strategies involve comparing key characteristics of compounds, compiled in a database (DB), to analytical data acquired by HRMS on actual environmental samples. The minimum information required to suspect a compound in a water sample is the exact mass of the compound of interest. Levels of confidence in suspected presence can be increased with additional compound-related data such as mass fragmentation patterns (MS/MS spectra) and chromatographic retention times (RT) (Schymanski et al., 2014). These additional data are usually obtained by injecting analytical standards into an LC-HRMS or GC-HRMS instrument, or are already contained in commercial or public databases such as the NORMAN Suspect List Exchange (https://www.norman-network.com/nds/SLE/). However, when analytical standards and databases are unavailable, analysts should consider using extensive suspect screening with enhanced databases built from in silico prediction tools. Recent developments in extensive suspect screening for pesticide TPs within water bodies have made it possible to identify many new focal compounds (Fonseca et al., 2019, Kiefer et al., 2019), which underscores the value of creating improved databases for suspect screening analysis.
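The exact-mass comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the 5 ppm tolerance, the [M+H]+ adduct choice and the second suspect entry are all assumptions made for the example. Only the molecular formula of tebuconazole (C16H22ClN3O) and the standard monoisotopic masses are taken as given.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes (standard reference values).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "Cl": 34.96885268,
}
PROTON = 1.00727646  # proton mass (u), used here for an assumed [M+H]+ adduct


def monoisotopic_mass(formula: str) -> float:
    """Sum isotope masses for a simple formula such as 'C16H22ClN3O'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def screen(observed_mz: float, suspects: dict, tol_ppm: float = 5.0) -> list:
    """Return suspects whose [M+H]+ m/z lies within tol_ppm of the observed m/z."""
    hits = []
    for name, formula in suspects.items():
        theoretical_mz = monoisotopic_mass(formula) + PROTON
        if abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm:
            hits.append(name)
    return hits


# Illustrative suspect list; the second entry is a hypothetical database row.
suspects = {
    "tebuconazole": "C16H22ClN3O",
    "hydroxylated TP (hypothetical entry)": "C16H22ClN3O2",
}
print(round(monoisotopic_mass("C16H22ClN3O"), 4))
print(screen(308.1530, suspects))  # 308.1530 is a made-up observed m/z
```

A mass match alone gives only a low level of confidence, which is why the database is then enriched with retention times and MS/MS spectra.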

In silico tools are defined here as commercially or freely available software or web platforms that use sophisticated algorithms to perform predictive tasks that would be too time-consuming or even impossible for a human to perform. The practicality of such in silico tools stems from their ability to predict compound properties solely from chemical identifiers, such as the simplified molecular-input line-entry system (SMILES), thus overcoming the need for analytical standards.

Some in silico tools, called transformation predictors, can predict the formation of possible TPs by using the chemical identifiers of the parent compound as an input. These tools are based on various pre-established physicochemical reactions that can occur in various environmental compartments (e.g. aquatic, terrestrial or biological) via both abiotic and biotic transformation processes on scales running from microbial up to mammalian metabolism. The appropriate transformation predictor has to be selected based on the environmental degradation processes investigated. TPs predicted by these transformation predictors carry a relatively high rate of false positives, but some predictors can use relative reasoning to address this issue (Bletsou et al., 2015). The efficiency of these tools has already been demonstrated. For instance, Jiao et al. (2022) recently detected 14 new TPs of the fungicide pyrisoxazole using literature data and one in silico tool, enviPath (Wicker et al., 2016), for database construction.

Another important subset of in silico tools is chromatographic RT prediction tools, which are usually based on the principles of quantitative structure–activity relationship (QSAR) models, extended to so-called quantitative structure–retention relationship (QSRR) models. Predictions are made on the assumption that there are relationships between the chemical structures of the compounds and their chromatographic RTs. These prediction tools are developed from predicted or experimental molecular descriptors, paired with experimental chromatographic RTs, for a group of compounds. This group is generally split into two: a “training set” used to establish the relationship between molecular descriptors and chromatographic RT, and a “testing set” used for validation. The group can also be divided into three by adding a “validation set”, which is used to control any overfitting produced during QSRR construction (Amos et al., 2018). The complexity of these QSRR models varies according to the number and type of molecular descriptors required to build them, and also according to the algorithms establishing the relationships, from multiple linear regression (MLR) to non-linear machine-learning (ML)-based QSRR. Taking into account the range of prediction error given by the QSRR model, the predicted chromatographic RTs can serve to eliminate outliers during suspect screening (Aalizadeh et al., 2019).
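To make the training/testing logic concrete, the following Python sketch fits the simplest possible QSRR, a linear regression on a single descriptor, and then uses the test-set error as an RT window for discarding implausible candidates. It is an illustration under stated assumptions: the descriptor (a logP-like value), the data (generated from a made-up linear law plus fixed noise), the split and the "three times RMSE" window are all invented for the example and do not come from the study.

```python
def fit_line(x, y):
    """Ordinary least squares for rt = a + b * descriptor (one descriptor)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def rmse(a, b, x, y):
    """Root-mean-square error of the fitted line on (x, y)."""
    return (sum((a + b * xi - yi) ** 2 for xi, yi in zip(x, y)) / len(x)) ** 0.5


def rt_plausible(rt_observed, descriptor, a, b, window):
    """Keep a candidate only if its observed RT is within +/- window of the prediction."""
    return abs(rt_observed - (a + b * descriptor)) <= window


# Synthetic data (NOT experimental): rt = 2.0 + 1.5 * descriptor + small fixed noise.
descriptor = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05]
rt_min = [2.0 + 1.5 * d + e for d, e in zip(descriptor, noise)]

# Hold out two compounds as the testing set; the rest form the training set.
test_idx = {3, 7}
pairs = list(zip(descriptor, rt_min))
train = [p for i, p in enumerate(pairs) if i not in test_idx]
test = [p for i, p in enumerate(pairs) if i in test_idx]

a, b = fit_line([x for x, _ in train], [y for _, y in train])
test_rmse = rmse(a, b, [x for x, _ in test], [y for _, y in test])
window = 3 * test_rmse  # assumed rule of thumb for the RT tolerance

print(f"slope={b:.3f} intercept={a:.3f} test RMSE={test_rmse:.3f} min")
print(rt_plausible(5.0, 2.0, a, b, window), rt_plausible(9.0, 2.0, a, b, window))
```

Real QSRR models use many descriptors and, often, non-linear learners, but the same split-fit-evaluate-filter sequence applies.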

Other in silico tools, such as SIRIUS (Dührkop et al., 2019), MAGMA (Ridder et al., 2012) or MetFrag (Ruttkies et al., 2016), can be used to annotate acquired MS/MS spectra a posteriori in order to identify compounds, or at least to increase confidence in their detection, during suspect and non-target analysis (Kiefer et al., 2019, Eysseric et al., 2021). A complementary approach consists of predicting MS/MS spectra before analytical acquisition (i.e. a priori) in order to enhance the suspect compound database. This can be done with fragmentation predictors like competitive fragmentation modeling-ID (CFM-ID) that employ neural network algorithms for a priori prediction of MS/MS spectra based solely on compound SMILES as input (Djoumbou-Feunang et al., 2019, Chao et al., 2020). The addition of predicted MS/MS spectra strengthens identification performance and limits compound mismatches during suspect screening analysis.
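One common way to use a predicted spectrum during screening is to score it against the measured spectrum. The sketch below computes a cosine similarity between two peak lists; it is an assumption for illustration, since the text does not say which scoring function was used. The peak lists, the 0.01 Da tolerance and the function name are all made up for the example.

```python
from math import sqrt


def cosine_score(predicted, measured, tol=0.01):
    """Cosine similarity between two peak lists [(mz, intensity), ...].

    Each predicted peak is paired with the closest unused measured peak
    whose m/z differs by at most tol (Da); unpaired peaks contribute zero.
    """
    used = set()
    dot = 0.0
    for mz_p, int_p in predicted:
        best = None
        for j, (mz_m, _) in enumerate(measured):
            if j in used or abs(mz_m - mz_p) > tol:
                continue
            if best is None or abs(mz_m - mz_p) < abs(measured[best][0] - mz_p):
                best = j
        if best is not None:
            used.add(best)
            dot += int_p * measured[best][1]
    norm_p = sqrt(sum(i * i for _, i in predicted))
    norm_m = sqrt(sum(i * i for _, i in measured))
    return dot / (norm_p * norm_m) if norm_p and norm_m else 0.0


# Made-up peak lists (m/z, relative intensity), for illustration only.
predicted = [(70.04, 100.0), (125.02, 60.0), (151.03, 10.0)]
measured = [(70.041, 90.0), (125.015, 70.0), (200.10, 5.0)]
score = cosine_score(predicted, measured)
print(f"cosine score = {score:.3f}")
```

A score close to 1 supports the suspect assignment; a score near 0 suggests that the precursor mass matched by coincidence.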

With that vision, a solution to better characterize water-body contamination by pesticide TPs could be to combine a selected set of these in silico tools, which are often used alone but, to our knowledge, have never been grouped into a comprehensive workflow. Here we address this gap by developing a comprehensive workflow for the creation of detailed databases for suspect screening of unknown compounds such as pesticide TPs in agricultural watersheds. Each step of this workflow allows the prediction of specific information about the TP compounds, such as their identity, chromatographic RT, and fragmentation spectra. The novelty of this approach is that it uses several in silico prediction tools based on innovative algorithms and cross-compares them with one another and against literature data. In addition to being easily transferable to other compounds or analytical conditions, this approach provides an enhanced ready-to-use database of a pesticide’s TPs for suspect screening analysis on environmental samples.

Section snippets

Pesticide selection

To demonstrate the potential of using a combination of in silico tools to create a suspect screening database of TPs, the triazole fungicide tebuconazole (TBZ) was used as a model compound. The main characteristics of this compound are presented in Table 1.

TBZ was selected primarily because it is one of the best-selling fungicides in the world and it has been applied for over twenty years in Europe due to its broad-spectrum activity (Cabras et al., 1997, Li et al., 2019). Moreover, the

Comparison of in silico and in biblio predictions for transformation products

The six transformation predictors used were able to predict 215 distinct TPs for TBZ. The literature search yielded 97 TPs, predominantly from the work of Storck et al. (2016) and El Azhari et al. (2018), which included previous experimental studies on TBZ degradation. The full database of TBZ TPs created at this workflow step can be consulted at the following address: https://doi.org/10.57745/Y3JLTV
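The cross-comparison between predictors and literature amounts to set operations on structure identifiers. The Python sketch below shows one way to organise it; the predictor names and TP labels are placeholders, and in practice the keys would need to be canonical identifiers (for example InChIKeys) so that the same structure from two sources compares equal. The helper `implied_overlap` applies inclusion-exclusion under the assumption, ours and not stated in the excerpt, that the 291-entry database is exactly the union of the 215 predicted and 97 literature TPs.

```python
from collections import Counter


def cross_compare(predictions: dict, literature: set) -> dict:
    """Merge per-predictor TP sets and compare them with literature TPs."""
    votes = Counter()
    for tps in predictions.values():
        votes.update(set(tps))  # one vote per predictor per TP
    predicted = set(votes)
    return {
        "database": predicted | literature,         # everything worth screening
        "confirmed": predicted & literature,        # predicted and already reported
        "predicted_only": predicted - literature,   # candidate new TPs
        "literature_only": literature - predicted,  # missed by all predictors
        "votes": votes,                             # how many predictors agree
    }


def implied_overlap(n_predicted: int, n_literature: int, n_database: int) -> int:
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|."""
    return n_predicted + n_literature - n_database


# Placeholder identifiers and predictor names (hypothetical).
predictions = {
    "predictor_1": {"TP-A", "TP-B", "TP-C"},
    "predictor_2": {"TP-B", "TP-D"},
}
literature = {"TP-B", "TP-E"}

result = cross_compare(predictions, literature)
print(sorted(result["database"]))
print(sorted(result["confirmed"]), result["votes"]["TP-B"])
print(implied_overlap(215, 97, 291))
```

TPs proposed by several predictors, or by a predictor and the literature, are natural candidates to prioritise during screening.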

Conclusions

This study proposed a comprehensive workflow for building detailed, ready-to-use databases to support suspect screening analyses of unknown compounds in agricultural watersheds. This novel workflow, combining several in silico tools, was applied to tebuconazole. It allowed the creation of a database of 291 tebuconazole transformation products, supplemented with their predicted chromatographic retention times and fragmentation patterns.

The six transformation predictors allowed to

Environmental implication

The in silico workflow presented in our work represents an improvement in the suspect screening of transformation products, which are undeniably ubiquitous and environmentally hazardous contaminants. Applied to the fungicide tebuconazole as a model compound, the workflow led to the detection of seven new transformation products in surface waters. Based on accessible and transposable in silico tools, the proposed workflow can be extended to a wide range of organic substances and reused by other

CRediT authorship contribution statement

Kevin Rocco: Conceptualization, Investigation, Methodology, Formal analysis, Software, Writing – original draft. Christelle Margoum: Conceptualization, Methodology, Funding acquisition, Project administration, Writing – review & editing. Loïc Richard: Investigation, Methodology, Formal analysis, Writing – review & editing. Marina Coquery: Conceptualization, Methodology, Validation, Supervision, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was performed as part of the “TAPIOCA” project funded by the French National Office for Biodiversity (OFB) and the Ecophyto II program. The authors thank the OFB’s ‘Réseau de surveillance prospective’ and Céline Guillemain for the acquisition of UHPLC-HRMS data on surface water samples. We thank Sylvain Merel for providing access to the predictions made using the Meteor Nexus and Zeneth software packages. We also thank MetaForm Langues for English editing. We are grateful to reviewers and

References (48)

  • E. Fonseca et al.

    Investigation of pesticides and their transformation products in the Júcar River Hydrographical Basin (Spain) by wide-scope high-resolution mass spectrometry screening

    Environ. Res.

    (2019)
  • M. Ibáñez et al.

    UHPLC-QTOF MS screening of pharmaceuticals and their metabolites in treated wastewater samples from Athens

    J. Hazard. Mater.

    (2017)
  • B. Jiao et al.

    Identification and ecotoxicity prediction of pyrisoxazole transformation products formed in soil and water using an effective HRMS workflow

    J. Hazard. Mater.

    (2022)
  • D. Kang et al.

    Identification of transformation products to characterize the ability of a natural wetland to degrade synthetic organic pollutants

    Water Res.

    (2020)
  • K. Kiefer et al.

    New relevant pesticide transformation products in groundwater detected using target and suspect screening for agricultural and urban micropollutants with LC-HRMS

    Water Res.

    (2019)
  • F. Le Cor et al.

    Occurrence of pesticides and their transformation products in headwater streams: contamination status and effect of ponds on contaminant concentrations

    Sci. Total Environ.

    (2021)
  • S. Li et al.

    Endocrine disrupting effects of tebuconazole on different life stages of zebrafish (Danio rerio)

    Environ. Pollut.

    (2019)
  • A.D. McEachran et al.

    A comparison of three liquid chromatography (LC) retention time prediction models

    Talanta

    (2018)
  • M.-C. Nika et al.

    Chlorination of benzothiazoles and benzotriazoles and transformation products identification by LC-HR-MS/MS

    J. Hazard. Mater.

    (2017)
  • C. Postigo et al.

    Synthetic organic compounds and their transformation products in groundwater: occurrence, fate and mitigation

    Sci. Total Environ., 503-

    (2015)
  • R.M. de Souza et al.

    Occurrence, impacts and general aspects of pesticides in surface water: a review

    Process Saf. Environ. Prot.

    (2020)
  • V. Storck et al.

    Identification and characterization of tebuconazole transformation products in soil by combining suspect screening and molecular typology

    Environ. Pollut.

    (2016)
  • J. Zhou et al.

    Profiling microbial removal of micropollutants in sand filters: biotransformation pathways and associated bacteria

    J. Hazard. Mater.

    (2022)
  • P. Bonini et al.

    Retip: retention time prediction for compound annotation in untargeted metabolomics

    Anal. Chem.

    (2020)