Prediction of antibiotic-resistance genes occurrence at a recreational beach with deep learning models
Introduction
The emergence of antibiotic resistance genes (ARGs) as contaminants of aquatic environments (Pruden et al., 2006) has become a significant global threat to public health. ARGs are released from landfills or sludge through runoff and can flow into recreational areas along the coast (Zhang et al., 2016b). Recreational beaches in particular are susceptible to ARG contamination from sources such as wastewater treatment plants (Proia et al., 2018), animal feed mills (Fang et al., 2018), and storm runoff (Joy et al., 2013). For example, a previous study found that surfers were 4.2 times more likely to be exposed to ARGs than non-surfers in swimming areas in England (Leonard et al., 2018). Rainfall is known to dilute ARGs naturally; however, ARGs are not sufficiently managed in marine environments because of current global wastewater management practices (Bedri et al., 2015; Law and Tang, 2016).
Monitoring ARGs at recreational beaches is necessary for the safety of beach users, but it has several limitations. Conventional analysis methods are time-consuming; it takes 5.2 d on average to verify incubation results (McAdam et al., 2012). Molecular biological techniques such as quantitative polymerase chain reaction (qPCR) have been used to identify and quantify specific ARGs (de Castro et al., 2014; Schmieder and Edwards, 2012). Although qPCR is simpler and faster than conventional techniques such as the culture method or traditional PCR (Kralik and Ricchi, 2017; Smith and Osborn, 2009), its high cost restricts regular monitoring (Sakthivel et al., 2012). Multiplex PCR was developed to save time and effort by running multiple single PCRs simultaneously, but it is less accurate because it is prone to nonspecific amplification products (Jansen et al., 2011; Sakthivel et al., 2012). Therefore, when a preemptive response to ARG occurrence at beaches is needed within a limited timeframe, prediction through modeling can be more efficient than monitoring.
Long short-term memory (LSTM), a type of recurrent neural network (RNN), has been widely used as an efficient tool to simulate and predict water quality because of its ability to extract features from time-series data (Hsu et al., 1997). For example, Barzegar et al. (2020) recently used LSTM and LSTM hybrid models to predict water quality variables in a lake. An advantage of LSTM is that it uses memory to learn features over time. Accordingly, it is considered a suitable neural network (NN) for predicting pollutant distributions and water quality over time (Wang et al., 2019; Wang et al., 2017). Hydrological models, on the other hand, suffer from higher uncertainties because they cannot simulate the complex mechanistic relationships among environmental variables (Abimbola et al., 2020). Although deep learning models are black-box models, they can improve their performance by training on observation data (Andrychowicz et al., 2016) and can simulate nonlinear phenomena occurring in the environment. In particular, deep learning models have been widely used to enhance the prediction performance of hydrological models (Parmar et al., 2017; Sumi et al., 2012). We therefore hypothesize that deep learning will predict ARGs more accurately than hydrological models at a recreational beach in Korea that is affected by rainfall over short periods.
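The memory mechanism mentioned above can be illustrated with the standard LSTM update equations. The following is a minimal sketch of a single scalar LSTM cell, not the models used in this study; the function names (`lstm_step`, `run_sequence`) and the parameter layout are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    """Logistic function used by the LSTM gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell.

    params maps each gate name ("f", "i", "o", "g") to a tuple
    (w, u, b): input weight, recurrent weight, and bias.
    All values are hypothetical, not taken from the study.
    """
    def pre(gate):
        w, u, b = params[gate]
        return w * x + u * h_prev + b

    f = sigmoid(pre("f"))    # forget gate: how much old memory to keep
    i = sigmoid(pre("i"))    # input gate: how much new information to store
    o = sigmoid(pre("o"))    # output gate: how much memory to expose
    g = math.tanh(pre("g"))  # candidate memory content

    c = f * c_prev + i * g   # updated cell state (the "memory")
    h = o * math.tanh(c)     # updated hidden state (the output)
    return h, c


def run_sequence(xs, params):
    """Feed a sequence through the cell, starting from zero state."""
    h, c = 0.0, 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, params)
    return h, c
```

The cell state `c` is carried from one step to the next, which is what lets the network retain information from earlier observations in a time series.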
According to the literature we collected, however, LSTM has not yet been used to estimate ARGs released into the environment. We previously observed that the occurrence of ARGs at a combined sewer overflow (CSO) site at Gwangalli Beach varied over time in relation to rainfall and tides (Jang et al., 2021). Recreational activities at the beach are concentrated in the summer, and the beach is affected by monsoon weather every year. ARG prediction is therefore important for protecting the health of beachgoers, and LSTM is a promising tool for predicting the occurrence of ARGs over time. In this study, we propose an approach based on NN techniques to predict ARG occurrence quickly and accurately, in support of managing and monitoring ARGs in beach environments. We compared conventional LSTM, LSTM-convolutional NN (CNN), and input attention (IA)-LSTM models (Fig. 1) with the following objectives: 1) to propose applicable models for predicting four major ARGs (i.e., aac(6′)-Ib-cr, blaTEM, sul1, and tetX) at a recreational beach, 2) to compare model accuracies when predicting a single ARG individually and multiple ARGs simultaneously, and 3) to determine the critical environmental features for predicting ARG occurrences.
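As a rough illustration of how an input attention layer can indicate which environmental features matter, the sketch below normalizes per-feature relevance scores with a softmax and rescales the inputs accordingly. This is a generic simplification and an assumption on our part, not the IA-LSTM architecture of this study: in a trained model the scores are learned, whereas here they are supplied by hand, and the names `softmax` and `apply_input_attention` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def apply_input_attention(features, scores):
    """Weight each input feature by its attention weight.

    features: values of the input variables at one time step.
    scores:   one unnormalized relevance score per feature
              (hand-supplied here; learned in a real model).
    Returns (weights, weighted_features).
    """
    if len(features) != len(scores):
        raise ValueError("features and scores must have the same length")
    weights = softmax(scores)
    weighted = [w * x for w, x in zip(weights, features)]
    return weights, weighted
```

Because the weights sum to one, they can be read as the relative importance assigned to each input variable at that time step.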
Sampling location and period
Gwangalli Beach, a popular beach in South Korea, was selected as the study area. The eastern coast of Gwangalli Beach is adjacent to the Suyeong River estuary, which consists of urban areas and has a wastewater treatment plant and several sewer outlets along the river (Fig. 2). The total area of the beach is 82,000 m²; the beach is 1.4 km in length and 25–110 m in width along the coastline (Choi et al., 2016). Seawater samples were collected at a CSO outfall on the right side of the beach (
Hyperparameter optimization for single ARG prediction
Comparing the partial dependence plots in the objective plots of all the models (Figs. S4 and S5) indicates that the learning rate was the most sensitive parameter during optimization. In contrast, the activation function in the CNN layer was the least sensitive parameter across models. The optimization results also showed that a higher lookback value yielded a greater reduction in the model test MSE in all cases except for blaTEM. No uniform trend for batch size can be
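To make the lookback and test MSE terms concrete, here is a minimal sketch assuming that lookback denotes the number of preceding time steps supplied to the network as input. It uses a single univariate series for brevity, whereas the study uses environmental variables as inputs; the helper names `make_windows` and `mean_squared_error` are hypothetical.

```python
def make_windows(series, lookback):
    """Split a series into (inputs, targets) pairs.

    Each input holds `lookback` consecutive values and the target is
    the value that immediately follows them.
    """
    if lookback < 1 or lookback >= len(series):
        raise ValueError("lookback must be in [1, len(series) - 1]")
    inputs, targets = [], []
    for start in range(len(series) - lookback):
        inputs.append(list(series[start:start + lookback]))
        targets.append(series[start + lookback])
    return inputs, targets


def mean_squared_error(observed, predicted):
    """Average of squared differences between two equal-length lists."""
    if len(observed) != len(predicted):
        raise ValueError("inputs must have the same length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
```

A larger lookback gives the model more history per sample but leaves fewer training samples from the same series, which is one reason it is treated as a hyperparameter to tune.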
Conclusions
The goal of this study was to improve the accuracy of predictions of ARG occurrence and to identify the variables that affect these predictions. To this end, the conventional LSTM, LSTM-CNN hybrid, and IA-LSTM models were compared for predicting ARG occurrence from environmental variables. The primary results of this study are as follows:
1) The sequential combination of LSTM and CNN improved performance over conventional LSTM for predicting single ARGs. We show
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This study was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. 2017R1D1A1B04033074), and by the Korea Environment Industry and Technology Institute (KEITI) through the Aquatic Ecosystem Conservation Research Program, funded by the Korea Ministry of Environment (MOE) (No. 2020003030003).
References (53)
- et al. (2020). Predicting Escherichia coli loads in cascading dams with machine learning: An integration of hydrometeorology, animal density and grazing pattern. Science of The Total Environment.
- et al. (2018). Human microbiomes and antibiotic resistance. Human Microbiome Journal.
- et al. (2015). Assessing the water quality response to an alternative sewage disposal strategy at bathing sites on the east coast of Ireland. Marine Pollution Bulletin.
- et al. (2018). Dissemination of antibiotic resistance genes and human pathogenic bacteria from a pig feedlot to the surrounding stream and agricultural soils. Journal of Hazardous Materials.
- et al. (2017). Fate of antibiotic resistance genes in mesophilic and thermophilic anaerobic digestion of chemically enhanced primary treatment (CEPT) sludge. Bioresource Technology.
- et al. (2021). Hydrometeorological Influence on Antibiotic-Resistance Genes (ARGs) and Bacterial Community at a Recreational Beach in Korea. Journal of Hazardous Materials.
- et al. (2011). Development and evaluation of a four-tube real time multiplex PCR assay covering fourteen respiratory viruses, and comparison to its corresponding single target counterparts. Journal of Clinical Virology.
- et al. (2017). Quantitative and qualitative changes in antibiotic resistance genes after passing through treatment processes in municipal wastewater treatment plants. Science of The Total Environment.
- et al. (2018). Exposure to and colonisation by antibiotic-resistant E. coli in UK coastal water users: Environmental surveillance, exposure assessment, and epidemiological study (Beach Bum Survey). Environment International.
- et al. (2018). Occurrence and persistence of carbapenemases genes in hospital and wastewater treatment plants and propagation in the receiving river. Journal of Hazardous Materials.
- Comparison of fast-track diagnostics respiratory pathogens multiplex real-time RT-PCR assay with in-house singleplex assays for comprehensive detection of human respiratory viruses. Journal of Virological Methods.
- Thermophilic anaerobic digestion: Effect of start-up strategies on performance and microbial community. Science of The Total Environment.
- Exploring the application of artificial intelligence technology for identification of water pollution characteristics and tracing the source of water quality pollutants. Science of The Total Environment.
- Water quality prediction method based on LSTM neural network.
- Occurrence of antibiotic resistance genes in landfill leachate treatment plant and its effluent-receiving soil and surface water. Environmental Pollution.
- Tensorflow: A system for large-scale machine learning.
- Feature engineering for machine learning: principles and techniques for data scientists.
- End-to-end attention-based large vocabulary speech recognition.
- Short-term water quality variable prediction using a hybrid CNN–LSTM deep learning model. Stochastic Environmental Research and Risk Assessment.
- Effects of Rainfall on Microbial Water Quality on Haeundae and Gwangan Swimming Beach. Journal of Bacteriology and Virology.
- Deep Learning mit Python und Keras: Das Praxis-Handbuch vom Entwickler der Keras-Bibliothek.
- Elements of information theory.
- Insights into novel antimicrobial compounds and antibiotic resistance genes from soil metagenomes. Frontiers in Microbiology.
- Neural network model for a mechanism of pattern recognition unaffected by shift in position: Neocognitron. Trans. IECE.
1 These authors contributed equally to this study.