Predicting crop root concentration factors of organic contaminants with machine learning models

doi:10.1016/j.jhazmat.2021.127437

Journal of Hazardous Materials

Volume 424, Part B, 15 February 2022, 127437

Highlights

• FCNN model achieved the best prediction performance for RCFs.
• Machine learning models performed better than the traditional linear regression model.
• Machine learning can identify important property descriptors for predicting RCFs.
• Machine learning can learn complex relationships in contaminant-soil-plant systems.

Abstract

Accurate prediction of uptake and accumulation of organic contaminants by crops from soils is essential to assessing human exposure via the food chain. However, traditional empirical or mechanistic models frequently show variable performance due to complex interactions among contaminants, soils, and plants. Thus, in this study different machine learning algorithms were compared and applied to predict root concentration factors (RCFs) based on a dataset comprising 57 chemicals and 11 crops, followed by comparison with a traditional linear regression model as the benchmark. The RCF patterns and predictions were investigated by unsupervised t-distributed stochastic neighbor embedding and four supervised machine learning models, namely Random Forest, Gradient Boosting Regression Tree, Fully Connected Neural Network, and Support Vector Regression, based on 15 property descriptors. The Fully Connected Neural Network demonstrated superior prediction performance for RCFs (R² = 0.79, mean absolute error [MAE] = 0.22) over the other machine learning models (R² = 0.68–0.76, MAE = 0.23–0.26). All four machine learning models performed better than the traditional linear regression model (R² = 0.62, MAE = 0.29). Four key property descriptors were identified in predicting RCFs. Specifically, increasing root lipid content and decreasing soil organic matter content increased RCFs, while increasing excess molar refractivity and molecular volume of contaminants decreased RCFs. These results show that machine learning models can improve prediction accuracy by learning nonlinear relationships between RCFs and properties of contaminants, soils, and plants.

Introduction

Release of organic chemicals into the agroecosystem occurs intentionally through agrichemical application and unintentionally through atmospheric deposition, contaminated soil amendments (such as manure or biosolids) or the use of contaminated irrigation water (Jiang et al., 2009, Li et al., 2011, Qi et al., 2020, Yurdakul et al., 2019). Plants can act as vectors for transferring chemicals from the environment to the food chain. Contaminant uptake by crop roots from contaminated soils is key to subsequent translocation and accumulation in plants (Pullagurala et al., 2018). Thus, extensive laboratory and field studies have been conducted to measure the transfer of many chemicals from soil to plant tissues (Doucette et al., 2018). Specifically, the transfer of organic contaminants from soils to crop roots is usually evaluated by root concentration factors (RCFs). RCFs are defined as the ratio of contaminant concentration in root to that in soil by assuming an equilibrium state for contaminants sorbed by the soil, dissolved in soil pore water, and accumulated in plant roots (Torralba Sanchez et al., 2017, McKone and Maddalena, 2007). Therefore, the interactions among chemicals, soils, and plants collectively determine the RCFs. These interactions include sorption–desorption of contaminants between soil and pore water, and plant root uptake of contaminants from soils (Li et al., 2019). The complex interactions of contaminants in plant-soil-water systems have most commonly been studied in experimental systems with limited soil and plant types for a small number of chemicals. The limited data and poor coverage of both chemical and environmental (plant and soil) properties do not allow for a systematic evaluation that links the influences of chemical, soil and plant properties to RCFs. In addition, new chemicals are being developed and discharged to the environment every year, and their uptake by crops may not be measured in a timely manner. 
Therefore, an alternative approach to lab and field experiments is to develop reliable prediction models, which can be used as a rapid screening tool to initially evaluate the potential transfer of contaminants from soils to crops (McKone and Maddalena, 2007, Mamy et al., 2015).
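The RCF definition above (concentration in root divided by concentration in soil) can be illustrated with a short sketch. The function name, the example concentrations, and the mg/kg units are hypothetical and chosen only for illustration; the log10 transform is shown as a common convenience, not as a step confirmed by the text.

```python
import math


def root_concentration_factor(c_root: float, c_soil: float) -> float:
    """Return the ratio of contaminant concentration in root to that in soil.

    Both concentrations must be expressed in the same units.
    """
    if c_soil <= 0:
        raise ValueError("soil concentration must be positive")
    return c_root / c_soil


# Hypothetical concentrations (assumed mg/kg), not taken from the dataset.
rcf = root_concentration_factor(c_root=2.5, c_soil=10.0)
log_rcf = math.log10(rcf)  # log scale, often convenient for ratios

print(f"RCF = {rcf:.2f}, log10(RCF) = {log_rcf:.3f}")
```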

Most existing models that predict plant uptake of organic contaminants from soil are either empirical regression models or mechanistic models. Empirical models have traditionally relied upon limited physicochemical properties (e.g., logKow and molecular weight) (Collins et al., 2006), while mechanistic models were developed based on the assumption of several uptake processes (Feng et al., 2019, Fantke et al., 2011). For example, Topp et al. (1986) proposed a simple linear regression model based on molecular weight to predict plant uptake of organic chemicals from soil. Ryan et al. (1988) combined empirical relationships of Briggs et al. (1982) with a simple partition model to build a screening model for assessing the uptake of non-ionic chemicals from soils. Trapp and Matthies (1995) developed a one-compartment model for uptake of organic chemicals by foliar vegetables incorporating a number of different uptake processes and potentially significant loss mechanisms. Chiou et al. (2001a) proposed a mechanistic partition-limited model for the passive root uptake of contaminants from soils. These models are limited either by the number of properties considered or by the oversimplification of the complex soil-plant-chemical interactions into several governing equations based on the assumed uptake processes. Hence it is challenging to generalize these models to a broader spectrum of chemicals, plants, and soils.
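As a rough illustration of the kind of single-descriptor empirical model described above, the sketch below fits an ordinary least-squares line to one predictor. The data points, the variable names, and the fitted coefficients are hypothetical; they are not the coefficients of Topp et al. (1986) or of any other cited model.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: molecular weight (g/mol) vs. a log-scale uptake factor.
molecular_weight = [100.0, 200.0, 300.0, 400.0]
log_uptake = [1.0, 0.5, 0.0, -0.5]

slope, intercept = fit_line(molecular_weight, log_uptake)
predicted_at_250 = slope * 250.0 + intercept
```

A model of this form uses a single property and a fixed linear shape, which is the limitation the paragraph above points to.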

In contrast, machine learning models do not rely on assumed mechanisms to discover data patterns and can learn complex relationships between input features and predicted targets (Ahmad et al., 2021, Chen et al., 2021). They can learn complex functions directly from data by training large numbers of parameters, which makes it possible to produce accurate predictions when underlying mechanisms are not completely understood or parameterized. There are in general two types of machine learning models, depending on whether training labels (e.g., RCF values in this study) are used. Supervised machine learning models utilize labels to train the model, while unsupervised machine learning models do not require labels. Common supervised machine learning models include Gradient Boosting Regression Tree (GBRT) (Friedman, 2001), Fully Connected Neural Network (FCNN) (Rumelhart et al., 1986), Support Vector Regression (SVR) (Cybenko, 1989), and Random Forest (RF) (Breiman, 2001), among many others. In fact, various machine learning models were developed based on these classic models. For example, LightBoost and XGBoost are variants of GBRT, the convolutional neural network is one type of neural network (NN), and DeepForest is a variant of RF. Recently, machine learning models have been increasingly used in environmental applications with varying performance. For example, RF was previously shown to perform the best among GBRT, FCNN, SVR, and RF in predicting chemical ecotoxicity (HC₅₀) (Hou et al., 2020a, Hou et al., 2020c). Deep neural network models were widely used to predict chemical properties with superior performance (Feinberg et al., 2018, Walters and Barzilay, 2020). More specifically, Bagheri et al. (2020) developed an NN model to predict RCFs from hydroponic studies using multiple physicochemical properties including molecular weight, logKow, rotatable bonds, hydrogen bond donor, hydrogen bond acceptor, and polar surface area. Gao et al. (2021) built a GBRT model with ECFP4 fingerprints to predict RCFs from soil concentrations. However, there has been no systematic comparison of various machine learning models for predicting RCFs, which is essential to the future application of machine learning in risk assessment.
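A systematic comparison of models needs common scoring metrics. The sketch below computes the coefficient of determination (R²) and the mean absolute error (MAE), the two metrics reported in the abstract, for two sets of predictions. The observed values and both prediction lists are hypothetical and do not come from the RCF dataset.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observations and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observed targets and predictions from two candidate models.
observed = [0.0, 1.0, 2.0, 3.0]
model_a = [0.1, 0.9, 2.2, 2.8]
model_b = [0.5, 0.5, 2.5, 2.5]

for name, pred in (("model_a", model_a), ("model_b", model_b)):
    print(name, round(r_squared(observed, pred), 3),
          round(mean_absolute_error(observed, pred), 3))
```

In this toy case model_a scores higher on R² and lower on MAE, which is the pattern used to rank models when both metrics agree.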

Additionally, unsupervised machine learning models can visualize and discover patterns from data. However, previous studies usually used supervised learning to predict RCFs, whereas unsupervised methods have been less explored. For example, t-SNE is a popular unsupervised learning method and has been successfully used in visualizing high dimensional data (e.g., discovering cell patterns from single cell sequencing data) (Belkina et al., 2019, Kobak and Berens, 2019). Recently, it has also been applied to visualize C-F bond energy patterns of per- and polyfluoroalkyl substances based on molecular descriptors (Raza et al., 2019). The ability of t-SNE to reveal local similarities of data points in the dataset makes it a valuable tool for exploring high dimensional data (Van der Maaten and Hinton, 2008), which can be used for evaluating RCFs from a multitude of chemical, soil, and plant properties. Finally, the crop uptake of organic contaminants from soils involves complex interactions among contaminants, soils, and plants. However, the analysis of important properties or features for RCF prediction has not been fully studied. Identifying important properties related to RCF prediction can enhance our understanding of plant root uptake of organic contaminants.
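For readers unfamiliar with how t-SNE captures local similarity, the standard formulation from Van der Maaten and Hinton (2008) is summarized below. The symbols are introduced here for illustration: x_i are the high dimensional points (for example, descriptor vectors), y_i their low dimensional embeddings, N the number of points, and σ_i a per-point bandwidth set by the chosen perplexity.

```latex
% Similarities in the original high dimensional space
p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2N}

% Similarities in the low dimensional map (Student t kernel, one degree of freedom)
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}

% Objective minimized over the embedding coordinates y_i
C = \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
```

Because the cost penalizes placing similar points far apart much more than placing dissimilar points close together, the map mainly preserves local neighborhoods, which is why nearby points in a t-SNE plot can be read as having similar descriptors.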

To fill the aforementioned knowledge gaps, this study aimed to predict the crop RCFs from the properties (features) of contaminants, soils, and plants using both supervised and unsupervised machine learning methods. Briefly, the patterns from the collected crop RCF dataset were first investigated with an unsupervised machine learning algorithm, namely t-distributed stochastic neighbor embedding (t-SNE) (Van der Maaten and Hinton, 2008). The uptake of organic chemicals by crops from soils was then predicted using four supervised machine learning models including GBRT, FCNN, RF, and SVR. The performances of the four machine learning models were systematically compared. Furthermore, two different feature importance analysis methods and individual conditional expectation analysis were performed. These analyses identified key parameters for predicting RCFs and revealed the complex relationships among target chemicals, plant, soil properties, and RCFs.
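The two kinds of analysis mentioned above can be sketched in a few lines. The text does not say here which feature importance methods were used, so permutation importance (measured as the increase in MAE after shuffling one feature) is shown only as one common choice. The `predict` function is a hypothetical stand-in for a trained model, and the feature names and values are invented for illustration.

```python
import random


def predict(row):
    # Hypothetical stand-in for a trained model; the third feature is ignored.
    f_lipid, f_om, _unused = row
    return 2.0 * f_lipid - 1.0 * f_om


def mae(model, X, y):
    return sum(abs(model(r) - t) for r, t in zip(X, y)) / len(y)


def permutation_importance(model, X, y, col, n_repeats=20, seed=0):
    """Mean increase in MAE when the values of one feature are shuffled."""
    rng = random.Random(seed)
    base = mae(model, X, y)
    increases = []
    for _ in range(n_repeats):
        shuffled = [r[col] for r in X]
        rng.shuffle(shuffled)
        X_perm = [r[:col] + (v,) + r[col + 1:] for r, v in zip(X, shuffled)]
        increases.append(mae(model, X_perm, y) - base)
    return sum(increases) / n_repeats


def ice_curve(model, row, col, grid):
    """Predictions for one sample as a single feature sweeps over a grid."""
    return [model(row[:col] + (v,) + row[col + 1:]) for v in grid]


# Hypothetical samples: (root lipid content, soil organic matter, unused feature).
X = [(0.1, 0.3, 5.0), (0.4, 0.1, 1.0), (0.7, 0.6, 3.0),
     (0.2, 0.9, 2.0), (0.9, 0.2, 4.0), (0.5, 0.5, 6.0)]
y = [predict(r) for r in X]  # targets the stand-in model reproduces exactly

imp_lipid = permutation_importance(predict, X, y, col=0)
imp_om = permutation_importance(predict, X, y, col=1)
imp_unused = permutation_importance(predict, X, y, col=2)
curve = ice_curve(predict, X[0], col=0, grid=[0.0, 0.5, 1.0])
```

Shuffling a feature the model ignores leaves the error unchanged, so its importance is zero, while the individual conditional expectation curve shows how the prediction for one sample responds as a single feature changes.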

Section snippets

RCF dataset

An RCF database for crop uptake of organic contaminants was compiled. The data were initially collected and screened from previous peer-reviewed studies (published 1959–2020) using Web of Science and searching terms that included “organic contaminants”, “plant uptake from soils”, “bioaccumulation factors”, and “bioconcentration factors”. Only studies that reported soil organic matter content (f_om) and accessible crop root lipid content (f_lipid) were included in the dataset. These two soil and

t-SNE plot of the RCF dataset

The RCF dataset was first explored with t-SNE to examine the effectiveness of property representation of chemicals, plants and soils. As an unsupervised machine learning method, t-SNE can find patterns of RCFs from other input features without using any RCF data. The 243 data points based on their property descriptors were clustered in Fig. 2 using t-SNE and then colored according to their corresponding RCF values. By reducing the high dimensional data into 2D space, data points with similar

Conclusions

With increasing exposure of organic chemicals to agroecosystems, it is essential to assess their uptake and accumulation in food crops. Accurate prediction of RCFs from soil has been challenging due to the complex interactions among chemicals, soils, and plants. Existing empirical regression models and mechanistic models were usually limited by their prediction abilities for diverse types of chemicals, plants, and soils. Emerging machine learning models provide new methodologies for predicting

CRediT authorship contribution statement

Feng Gao: Conceptualization, Methodology, Formal analysis, Software, Investigation, Resources, Visualization, Writing – original draft, Writing – review & editing. Yike Shen: Conceptualization, Investigation, Writing – review & editing. J. Brett Sallach: Conceptualization, Writing – review & editing. Hui Li: Conceptualization, Writing – review & editing. Wei Zhang: Conceptualization, Writing – review & editing. Yuanbo Li: Conceptualization, Funding, Writing – review & editing. Cun Liu:

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The work was supported by the National Key Research and Development Program of China, China (2019YFC1604503 and 2016YFD0800403).

References (62)

M.B. Ahmad et al. Adsorption of Indigo Carmine dye onto the surface-modified adsorbent prepared from municipal waste and simulation using deep neural network. J. Hazard. Mater. (2021)
M. Bagheri et al. Examining plant uptake and translocation of emerging contaminants using machine learning: implications to food security. Sci. Total Environ. (2020)
W. Chen et al. Evaluation of different boosting ensemble machine learning models and novel deep learning and boosting framework for head-cut gully erosion susceptibility. J. Environ. Manag. (2021)
Y.-H. Chuang et al. Mechanistic study on uptake and transport of pharmaceuticals in lettuce from water. Environ. Int. (2019)
P. Fantke et al. Plant uptake of pesticides and human health: dynamic modeling of residues in wheat and ingestion intake. Chemosphere (2011)
X. Feng et al. Dynamic modeling of famoxadone and oxathiapiprolin residue on cucumber and Chinese cabbage based on tomato and lettuce archetypes. J. Hazard. Mater. (2019)
C.E. Golden et al. Comparison between random forest and gradient boosting machine methods for predicting Listeria spp. prevalence in the environment of pastured poultry farms. Food Res. Int. (2019)
P. Hou et al. Estimate ecotoxicity characterization factors for chemicals in life cycle assessment using machine learning models. Environ. Int. (2020)
W.-N. Hung et al. Lipid–water partition coefficients and correlations with uptakes by algae of organic compounds. J. Hazard. Mater. (2014)
Y.-F. Jiang et al. Occurrence, distribution and possible sources of organochlorine pesticides in agricultural soil of Shanghai, China. J. Hazard. Mater. (2009)
Q. Liu et al. Uptake kinetics and accumulation of pesticides in wheat (Triticum aestivum L.): impact of chemical and plant properties. Environ. Pollut. (2021)
H. Li et al. Fishpond sediment-borne DDTs and HCHs in the Pearl River Delta: characteristics, environmental risk and fate following the use of the sediment as plant growth media. J. Hazard. Mater. (2011)
H.-X. Luo et al. Comparison of machine learning algorithms for mapping mango plantations based on Gaofen-1 imagery. J. Integr. Agric. (2020)
V.L.R. Pullagurala et al. Plant uptake and translocation of contaminants of emerging concern in soil. Sci. Total Environ. (2018)
P. Qi et al. Investigation of polycyclic aromatic hydrocarbons in soils from Caserta provincial territory, southern Italy: Spatial distribution, source apportionment, and risk assessment. J. Hazard. Mater. (2020)
J. Ryan et al. Plant uptake of non-ionic organic chemicals from soils. Chemosphere (1988)
E. Topp et al. Factors affecting the uptake of 14C-labeled organic chemicals by plants from soil. Ecotoxicol. Environ. Saf. (1986)
Z. Yang et al. Performance of the partition-limited model on predicting ryegrass uptake of polycyclic aromatic hydrocarbons. Chemosphere (2007)
S. Yurdakul et al. Levels, temporal/spatial variations and sources of PAHs and PCBs in soil of a highly industrialized area. Atmos. Pollut. Res. (2019)
X. Zhan et al. Influence of plant root morphology and tissue composition on phenanthrene uptake: stepwise multiple linear regression analysis. Environ. Pollut. (2013)
X. Zhu et al. The application of machine learning methods for prediction of metal sorption onto biochars. J. Hazard. Mater. (2019)
A. Altmann et al. Permutation importance: a corrected feature importance measure. Bioinformatics (2010)
A.C. Belkina et al. Automated optimized parameters for T-distributed stochastic neighbor embedding improve visualization and analysis of large datasets. Nat. Commun. (2019)
L. Breiman. Random forests. Mach. Learn. (2001)
G.G. Briggs et al. Relationships between lipophilicity and root uptake and translocation of non‐ionised chemicals by barley. Pestic. Sci. (1982)
L.J. Carter et al. Fate and uptake of pharmaceuticals in soil–plant systems. J. Agric. Food Chem. (2014)
C.T. Chiou et al. A partition-limited model for the plant uptake of organic contaminants from soil and water. Environ. Sci. Technol. (2001)
Collins, C., Martin, I., Fryer, M., 2006. Evaluation of models for predicting plant uptake of chemicals from soil,...
G. Cybenko. Approximation by superpositions of a sigmoidal function. Math. Control. Signals, Syst. (1989)
W.J. Doucette et al. A review of measured bioaccumulation data on terrestrial plants for organic chemicals: metrics, variability, and the need for standardized measurement protocols. Environ. Toxicol. Chem. (2018)
P. Ertl et al. Fast calculation of molecular polar surface area as a sum of fragment-based contributions and its application to the prediction of drug transport properties. J. Med. Chem. (2000)

¹: Equal contribution.
