Enhanced database creation with in silico workflows for suspect screening of unknown tebuconazole transformation products in environmental samples by UHPLC-HRMS
Introduction
Pesticides are chemical compounds used mainly in agriculture to control plant pests and improve crop yields. Once in the environment, pesticides can be degraded into transformation products (TPs) via both biotic and abiotic transformation processes (Fenner et al., 2013, Escher and Fenner, 2011). The compounds formed by these transformation processes are generally smaller, more persistent in the environment and more mobile than the parent compound, which can increase their transport to surface water and groundwater by runoff or seepage from agricultural soils (Boxall et al., 2004, Postigo and Barceló, 2015). As a rule, these structural and property changes do not increase the toxicity of TPs compared to their parent compounds. However, among the multitude of products formed, some may be exceptions to this rule, which makes it important to identify them (Escher and Fenner, 2011). Because many TPs remain unidentified, the toxicity of pesticides and their TPs in water bodies is globally underestimated (Mahler et al., 2021, Moschet et al., 2014). Novel approaches are therefore needed to identify these unknown TPs.
The simultaneous quantification of pesticides and their known TPs in water bodies has revealed that TPs can occur more frequently and at higher concentrations than their parent compounds. For example, in headwater streams, Le Cor et al. (2021) found that pesticide TPs accounted for more than half of the substances detected and that TP concentrations were often ten times higher than the parent-compound concentrations (0.46 ± 0.02 μg/L for the TP metazachlor-ESA versus 0.047 ± 0.007 μg/L for the parent metazachlor). However, such targeted analyses are limited by the lack of analytical standards for most pesticide TPs. To overcome this gap, powerful techniques such as high-resolution mass spectrometry (HRMS) have been developed over the last decade. Gas chromatography (GC) or liquid chromatography (LC) coupled with HRMS can serve to develop suspect and non-target screening (NTS) strategies that bring a more holistic understanding of the environmental fate of organic chemicals by untangling the unknowns (Escher et al., 2020).
Suspect screening strategies involve comparing key characteristics of compounds, compiled in a database (DB), to analytical data acquired by HRMS on actual environmental samples. The minimum information required to suspect a compound in a water sample is its exact mass. The level of confidence in a suspected presence can be increased with additional compound-related data such as mass fragmentation patterns (MS/MS spectra) and chromatographic retention times (RT) (Schymanski et al., 2014). These additional data are usually obtained by injecting analytical standards into an LC- or GC-HRMS instrument, or are already contained in commercial or public databases such as the NORMAN Suspect List Exchange (https://www.norman-network.com/nds/SLE/). However, when analytical standards and databases are unavailable, analysts should consider using extensive suspect screening with enhanced databases built from in silico prediction tools. Recent developments in extensive suspect screening for pesticide TPs in water bodies have made it possible to identify many new focal compounds (Fonseca et al., 2019, Kiefer et al., 2019), which underscores the value of creating improved databases for suspect screening analysis.
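The exact-mass comparison at the core of suspect screening can be illustrated with a minimal Python sketch. It is not the authors' implementation. The molecular formula of tebuconazole (C16H22ClN3O) is taken from general chemical knowledge rather than from this text, and the positive-mode [M+H]+ adduct, the 5 ppm tolerance and the two observed m/z values are assumptions chosen for illustration only.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes (standard reference values).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "Cl": 34.96885268,
}
PROTON = 1.00727647  # mass of a proton, used for the [M+H]+ adduct


def parse_formula(formula):
    """Turn a formula such as 'C16H22ClN3O' into {'C': 16, 'H': 22, ...}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


def mz_protonated(formula):
    """Theoretical m/z of the singly charged [M+H]+ ion (assumed adduct)."""
    return monoisotopic_mass(formula) + PROTON


def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def screen(features, suspects, tol_ppm=5.0):
    """Return (name, observed m/z, ppm error) for every match within tolerance."""
    hits = []
    for mz_obs in features:
        for name, formula in suspects.items():
            err = ppm_error(mz_obs, mz_protonated(formula))
            if abs(err) <= tol_ppm:
                hits.append((name, mz_obs, round(err, 2)))
    return hits


# One-entry suspect list and two hypothetical features from an HRMS run.
suspects = {"tebuconazole": "C16H22ClN3O"}
features = [308.1521, 310.2000]
hits = screen(features, suspects)
print(hits)
```

A mass match alone gives only a tentative assignment, which is why the predicted retention times and MS/MS spectra discussed below are added to the database.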
In silico tools are defined here as commercially or freely available software or web platforms that use sophisticated algorithms to perform predictive tasks that would be too time-consuming or even impossible for a human to perform. The practicality of such in silico tools stems from their ability to predict compound properties solely from chemical identifiers, such as the simplified molecular-input line-entry system (SMILES), thus overcoming the need for analytical standards.
Some in silico tools, called transformation predictors, can predict the formation of possible TPs by using the chemical identifiers of the parent compound as an input. These tools are based on various pre-established physicochemical reactions that can occur in various environmental compartments (e.g. aquatic, terrestrial or biological) via both abiotic and biotic transformation processes on scales running from microbial up to mammalian metabolism. The appropriate transformation predictor has to be selected based on the environmental degradation processes investigated. TPs predicted by these transformation predictors carry a relatively high rate of false positives, but some predictors can use relative reasoning to address this issue (Bletsou et al., 2015). The efficiency of these tools has already been proven. For instance, Jiao et al. (2022) recently detected 14 new TPs of the fungicide pyrisoxazole using literature data and one in silico tool, enviPath (Wicker et al., 2016), for database construction.
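The idea of applying pre-established reactions to a parent compound can be sketched in Python at the level of molecular formulas. This is a deliberate simplification: real transformation predictors operate on chemical structures and decide where a reaction can occur, whereas the sketch below only does the atom bookkeeping. The three rules are generic textbook reactions chosen for illustration, not rules taken from any of the predictors named in this paper, and the tebuconazole formula comes from general chemical knowledge.

```python
import re

# Hypothetical, formula-level reaction rules: net change in atom counts.
RULES = {
    "hydroxylation": {"O": +1},
    "reductive dechlorination": {"Cl": -1, "H": +1},
    "demethylation": {"C": -1, "H": -2},
}


def parse_formula(formula):
    """Turn 'C16H22ClN3O' into a dict of atom counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def to_formula(counts):
    """Write atom counts in Hill order: C, H, then the rest alphabetically."""
    others = sorted(el for el in counts if el not in ("C", "H"))
    parts = []
    for el in ["C", "H"] + others:
        n = counts.get(el, 0)
        if n > 0:
            parts.append(el if n == 1 else f"{el}{n}")
    return "".join(parts)


def apply_rule(counts, delta):
    """Apply one rule; return None if the parent lacks the atoms it removes."""
    result = dict(counts)
    for el, change in delta.items():
        result[el] = result.get(el, 0) + change
        if result[el] < 0:
            return None
    return result


parent = parse_formula("C16H22ClN3O")  # tebuconazole
products = {}
for rule_name, delta in RULES.items():
    product = apply_rule(parent, delta)
    if product is not None:
        products[rule_name] = to_formula(product)

print(products)
```

Because a formula-level rule cannot tell whether the reaction is chemically possible at any position of the molecule, such a list over-generates candidates, which mirrors the false-positive issue noted above.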
Another important subset of in silico tools is chromatographic RT prediction tools, which are usually based on the principles of quantitative structure–activity relationship (QSAR) models, extended to so-called quantitative structure–retention relationship (QSRR) models. Predictions rest on the assumption that there are relationships between the chemical structures of compounds and their chromatographic RTs. These prediction tools are developed from predicted or experimental molecular descriptors of a group of compounds, which are associated with the experimental chromatographic RTs of those compounds. This group is generally split into two: a “training set”, which establishes the relationship between molecular descriptors and chromatographic RT, and a “testing set”, which is used for validation. The group can also be divided into three by adding a “validation set”, which is used to control any overfitting produced during QSRR construction (Amos et al., 2018). The complexity of these QSRR models varies according to the number and type of molecular descriptors required to build them, and also according to the algorithms establishing the relationships, from multiple linear regression (MLR) to non-linear machine-learning (ML)-based QSRR. Taking into account the range of prediction error given by the QSRR model, the predicted chromatographic RTs can serve to eliminate outliers during suspect screening (Aalizadeh et al., 2019).
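The training/testing logic of a QSRR model and its use as an RT filter can be sketched in Python with the simplest possible case, a one-descriptor linear regression. This is an illustrative sketch, not one of the models evaluated in this work: the descriptor values, retention times and candidate names are hypothetical, and taking the largest testing-set error as the acceptance window is an assumption made here for simplicity.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical (descriptor value, experimental RT in min) pairs.
training_set = [(0.5, 2.1), (1.5, 3.9), (2.5, 6.1), (3.5, 7.9)]
testing_set = [(1.0, 3.0), (3.0, 7.2)]

# The training set establishes the descriptor-RT relationship.
slope, intercept = fit_line(
    [d for d, _ in training_set], [rt for _, rt in training_set]
)


def predict_rt(descriptor):
    """Predict a retention time from the fitted relationship."""
    return slope * descriptor + intercept


# The testing set gives the prediction error, used here as the RT window.
rt_window = max(abs(rt - predict_rt(d)) for d, rt in testing_set)

# Hypothetical suspects: (name, descriptor value, observed RT in min).
candidates = [("TP-A", 2.0, 5.1), ("TP-B", 2.0, 9.0)]
kept = [
    name
    for name, descriptor, rt_obs in candidates
    if abs(rt_obs - predict_rt(descriptor)) <= rt_window
]
print(kept)
```

A suspect whose observed RT falls outside the window is treated as an outlier and removed, even if its exact mass matches a database entry.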
Other in silico tools, such as SIRIUS (Dührkop et al., 2019), MAGMA (Ridder et al., 2012) or MetFrag (Ruttkies et al., 2016), can be used to annotate acquired MS/MS spectra a posteriori in order to identify compounds, or at least to increase confidence in their detection during suspect and non-target analysis (Kiefer et al., 2019, Eysseric et al., 2021). A complementary approach consists of predicting MS/MS spectra before analytical acquisition (i.e. a priori) in order to enhance the suspect compounds database. This can be done with fragmentation predictors like competitive fragmentation modeling-ID (CFM-ID), which employ neural network algorithms for a priori prediction of MS/MS spectra using only the SMILES of the compounds as input (Djoumbou-Feunang et al., 2019, Chao et al., 2020). Adding predicted MS/MS spectra strengthens identification performance and limits compound mismatches during suspect screening analysis.
With that vision, a solution to better characterize water-body contamination by pesticide TPs could be to combine a selected set of these in silico tools, which are often used alone but, to our knowledge, have never been grouped into a comprehensive workflow. Here we address this gap by developing a comprehensive workflow for the creation of detailed databases for suspect screening of unknown compounds such as pesticide TPs in agricultural watersheds. Each step of this workflow allows the prediction of specific information about the TP compounds, such as their identity, chromatographic RT, and fragmentation spectra. The novelty of this approach is that it uses several in silico prediction tools based on innovative algorithms and cross-compares their outputs with each other and against literature data. In addition to being easily transferable to other compounds or analytical conditions, this approach provides an enhanced ready-to-use database of a pesticide’s TPs for suspect screening analysis on environmental samples.
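One common way to compare a predicted MS/MS spectrum with an acquired one is a cosine similarity computed over peaks matched within an m/z tolerance. The Python sketch below shows this generic score. It is not the scoring function of CFM-ID or of any other tool cited here, and the peak lists and the 0.01 m/z tolerance are illustrative assumptions rather than measured or predicted data.

```python
import math


def cosine_score(spec_a, spec_b, tol=0.01):
    """Cosine similarity between two spectra given as (m/z, intensity) lists.

    Each peak of spec_a is paired with the closest unused peak of spec_b
    lying within `tol` m/z units; unmatched peaks contribute nothing to
    the dot product but still count in the norms.
    """
    used = set()
    dot = 0.0
    for mz_a, int_a in spec_a:
        best = None
        for j, (mz_b, _) in enumerate(spec_b):
            if j in used or abs(mz_a - mz_b) > tol:
                continue
            if best is None or abs(mz_a - mz_b) < abs(mz_a - spec_b[best][0]):
                best = j
        if best is not None:
            used.add(best)
            dot += int_a * spec_b[best][1]
    norm_a = math.sqrt(sum(i * i for _, i in spec_a))
    norm_b = math.sqrt(sum(i * i for _, i in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Illustrative peak lists: a "predicted" and an "acquired" spectrum.
predicted = [(70.040, 100.0), (125.015, 60.0), (151.030, 20.0)]
acquired = [(70.041, 90.0), (125.016, 70.0), (200.000, 10.0)]

score = cosine_score(predicted, acquired)
print(round(score, 3))
```

A score close to 1 supports the suspect assignment, while a score close to 0 indicates that the acquired fragments do not resemble the predicted ones.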
Pesticide selection
To demonstrate the potential of using a combination of in silico tools to create a suspect screening database of TPs, the triazole fungicide tebuconazole (TBZ) was used as a model compound. The main characteristics of this compound are presented in Table 1.
TBZ was selected primarily because it is one of the best-selling fungicides in the world and it has been applied for over twenty years in Europe due to its broad-spectrum activity (Cabras et al., 1997, Li et al., 2019). Moreover, the
Comparison of in silico and in biblio predictions for transformation products
The six transformation predictors used were able to predict 215 distinct TPs for TBZ. The literature search yielded 97 TPs, predominantly from the work of Storck et al. (2016) and El Azhari et al. (2018), which included previous experimental studies on TBZ degradation. The full database of TBZ TPs created at this workflow step can be consulted at the following address: https://doi.org/10.57745/Y3JLTV //entrepot.recherche.data.gouv.fr/privateurl.xhtml?token=d4530060-0c3d-4ebe-ad14-38be413c7d62
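Merging predicted and literature TPs into a single database amounts to a union keyed on a structure identifier, while keeping track of which sources proposed each entry. The Python sketch below illustrates this under stated assumptions: the source names and identifiers are hypothetical placeholders, and in practice a canonical structure identifier would be needed so that the same TP from two sources is recognized as one entry. If the 291-entry database reported in the Conclusions is the plain union of the 215 predicted and 97 literature TPs, which the text does not state explicitly, the two sets would share 215 + 97 − 291 = 21 compounds.

```python
# Hypothetical TP identifiers returned by each source.
sources = {
    "predictor_A": {"TP-001", "TP-002", "TP-003"},
    "predictor_B": {"TP-002", "TP-004"},
    "literature": {"TP-003", "TP-005"},
}


def merge_sources(source_sets):
    """Map each distinct TP identifier to the set of sources proposing it."""
    merged = {}
    for source, identifiers in source_sets.items():
        for identifier in identifiers:
            merged.setdefault(identifier, set()).add(source)
    return merged


database = merge_sources(sources)

# Entries before and after removing duplicates across sources.
total_entries = sum(len(ids) for ids in sources.values())
distinct_entries = len(database)

# TPs proposed by at least two independent sources.
consensus = sorted(tp for tp, found_in in database.items() if len(found_in) >= 2)

print(total_entries, distinct_entries, consensus)
```

Keeping the list of sources per TP makes it possible to cross-compare predictors with each other and with the literature, for example by prioritizing TPs proposed by several sources.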
Conclusions
This study proposed a comprehensive workflow for building detailed, ready-to-use databases to support suspect screening analyses of unknown compounds in agricultural watersheds. This novel workflow, which combines several in silico tools, was applied to tebuconazole. It allowed the creation of a database of 291 tebuconazole transformation products, supplemented with their predicted chromatographic retention times and fragmentation patterns.
The six transformation predictors allowed to
Environmental implication
The in silico workflow presented in our work represents an improvement in the suspect screening of transformation products, which are undeniably ubiquitous and environmentally hazardous contaminants. Applied to the fungicide tebuconazole as a model compound, the workflow led to the detection of seven new transformation products in surface waters. Based on accessible and transposable in silico tools, the proposed workflow can be extended to a wide range of organic substances and reused by other
CRediT authorship contribution statement
Kevin Rocco: Conceptualization, Investigation, Methodology, Formal analysis, Software, Writing – original draft. Christelle Margoum: Conceptualization, Methodology, Funding acquisition, Project administration, Writing – review & editing. Loïc Richard: Investigation, Methodology, Formal analysis, Writing – review & editing. Marina Coquery: Conceptualization, Methodology, Validation, Supervision, Writing – review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was performed as part of the “TAPIOCA” project funded by the French National Office for Biodiversity (OFB) and the Ecophyto II program. The authors thank the OFB’s ‘Réseau de surveillance prospective’ and Céline Guillemain for the acquisition of UHPLC-HRMS data on surface water samples. We thank Sylvain Merel for providing access to the predictions done using Meteor Nexus and Zeneth software packages. We also thank MetaForm Langues for English editing. We are grateful to reviewers and
References (48)
- et al. Development and application of retention time prediction models in the suspect and non-target screening of emerging contaminants. J. Hazard. Mater. (2019)
- et al. Molecular modeling and prediction accuracy in quantitative structure-retention relationship calculations for chromatography. TrAC Trends Anal. Chem. (2018)
- et al. Suspect screening of large numbers of emerging contaminants in environmental waters using artificial neural networks for chromatographic retention time prediction and high resolution mass spectrometry data analysis. Sci. Total Environ. (2015)
- et al. Critical evaluation of a simple retention time predictor based on LogKow as a complementary tool in the identification of emerging contaminants in water. Talanta (2015)
- et al. Pesticides in surface water from Brazil and Paraguay cross-border region: Screening using LC-QTOF MS and correlation with land use and occupation through multivariate analysis. Microchem. J. (2021)
- et al. Targeted and non-targeted liquid chromatography-mass spectrometric workflows for identification of transformation products of emerging pollutants in the aquatic environment. TrAC Trends Anal. Chem. (2015)
- et al. Suspect screening of environmental contaminants by UHPLC-HRMS and transposable quantitative structure-retention relationship modelling. J. Hazard. Mater. (2021)
- et al. The dissipation and microbial ecotoxicity of tebuconazole and its transformation products in soil under standard laboratory and simulated winter conditions. Sci. Total Environ., 637- (2018)
- et al. Non-targeted screening of trace organic contaminants in surface waters by a multi-tool approach based on combinatorial analysis of tandem mass spectra and open access databases. Talanta (2021)
- et al. Evaluation and application of machine learning-based retention time prediction for suspect screening of pesticides and pesticide transformation products in LC-HRMS. Chemosphere (2021)
- Investigation of pesticides and their transformation products in the Júcar River Hydrographical Basin (Spain) by wide-scope high-resolution mass spectrometry screening. Environ. Res.
- UHPLC-QTOF MS screening of pharmaceuticals and their metabolites in treated wastewater samples from Athens. J. Hazard. Mater.
- Identification and ecotoxicity prediction of pyrisoxazole transformation products formed in soil and water using an effective HRMS workflow. J. Hazard. Mater.
- Identification of transformation products to characterize the ability of a natural wetland to degrade synthetic organic pollutants. Water Res.
- New relevant pesticide transformation products in groundwater detected using target and suspect screening for agricultural and urban micropollutants with LC-HRMS. Water Res.
- Occurrence of pesticides and their transformation products in headwater streams: contamination status and effect of ponds on contaminant concentrations. Sci. Total Environ.
- Endocrine disrupting effects of tebuconazole on different life stages of zebrafish (Danio rerio). Environ. Pollut.
- A comparison of three liquid chromatography (LC) retention time prediction models. Talanta
- Chlorination of benzothiazoles and benzotriazoles and transformation products identification by LC-HR-MS/MS. J. Hazard. Mater.
- Synthetic organic compounds and their transformation products in groundwater: occurrence, fate and mitigation. Sci. Total Environ., 503-
- Occurrence, impacts and general aspects of pesticides in surface water: a review. Process Saf. Environ. Prot.
- Identification and characterization of tebuconazole transformation products in soil by combining suspect screening and molecular typology. Environ. Pollut.
- Profiling microbial removal of micropollutants in sand filters: biotransformation pathways and associated bacteria. J. Hazard. Mater.
- Retip: retention time prediction for compound annotation in untargeted metabolomics. Anal. Chem.