A novel Hybrid Wavelet-Locally Weighted Linear Regression (W-LWLR) Model for Electrical Conductivity (EC) Prediction in Surface Water
Introduction
Surface water resources, such as rivers, streams, lakes, and reservoirs, are vital sources of water for drinking, agricultural irrigation, mining, and industry. Because fresh surface water is scarce, water quality monitoring and control are key strategies for obtaining the information needed for decision making and for understanding water contamination and its spatial and temporal variations. Salinity, expressed through electrical conductivity (EC), is one of the most significant water quality parameters for determining the suitability of water for drinking and irrigation (Kumarasamy et al., 2014). EC is normally measured in microsiemens per centimeter (μS/cm) (Heydari et al., 2013). Since EC is dominated by total dissolved solids (TDS) and is directly related to dissolved ionic solutes such as sodium (Na+), chloride (Cl−), magnesium (Mg+2), sulfate (SO4−2), and calcium (Ca+2) in water, it can serve as an indicator of pollutants in surface water. An increased ionic content has a significant influence on plant growth and can reduce the quality of drinking water.
In surface water quality classifications, EC is a main measure of the salinity hazard for irrigation and drinking water. The U.S. Department of Agriculture (Wilcox, 1948) classifies surface water quality based on EC and the sodium concentration of water, while the World Health Organization (2008) classifies it based on EC alone. The EC of freshwater usually varies between 0 and 1500 μS/cm, while the EC of sea water can be as high as 50,000 μS/cm. According to the Wilcox EC-based classification for irrigation water, 0–750 μS/cm, 750–2000 μS/cm, and >2000 μS/cm are classified as fine, allowable, and unacceptable, respectively (Wilcox, 1948). An EC level higher than 10,000 μS/cm is not suitable for human consumption or irrigation. The maximum permissible EC value recommended by the World Health Organization (WHO, 1993) for drinking water is 1400 μS/cm.
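The EC-based irrigation classes quoted above can be expressed as a small lookup. The following is a minimal sketch, assuming the thresholds are in μS/cm (consistent with the freshwater range given in the same paragraph) and assuming that a value exactly on a boundary belongs to the lower class; the function name `classify_irrigation_ec` is hypothetical and not taken from the study.

```python
def classify_irrigation_ec(ec_us_per_cm: float) -> str:
    """Return the EC-based irrigation class quoted in the text.

    ec_us_per_cm is the electrical conductivity in microsiemens per centimeter.
    Boundary values (750, 2000) are assigned to the lower class; that choice
    is an assumption, since the text lists the ranges as 0-750 and 750-2000.
    """
    if ec_us_per_cm < 0:
        raise ValueError("EC cannot be negative")
    if ec_us_per_cm <= 750:
        return "fine"
    if ec_us_per_cm <= 2000:
        return "allowable"
    return "unacceptable"


# Made-up sample values, for illustration only.
for ec in (420.0, 1300.0, 2600.0):
    print(ec, classify_irrigation_ec(ec))
```

A classifier of this kind only covers the EC part of the Wilcox scheme; the sodium-based part of the classification is not reproduced here.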
Nowadays, the prediction of water quality parameters such as TDS, EC, and turbidity is among the challenging issues in water resources management. Multivariate statistics and time series analysis have been widely used for water quality prediction. Commonly used methods include moving average (MA), autoregressive moving average (ARMA), autoregressive integrated moving average (ARIMA), and multiple linear regression (MLR) (Çamdevýren et al., 2005; Civelekoglu et al., 2007). However, these traditional models may not provide adequate predictions because of the lack of reliable instruments to collect observation data for a given timeframe, the complexity of the factors that affect forecasting, and their inability to capture the non-stationarity and nonlinearity of water quality parameters (Deng et al., 2015).
In recent decades, artificial intelligence (AI) methods, which can overcome the drawbacks of the traditional methods, have been successfully applied in various areas of environmental engineering, including water quality prediction. The artificial neural network (ANN) and the adaptive neuro-fuzzy inference system (ANFIS) techniques have been widely employed to predict the nocturnal dynamics of dissolved oxygen (DO) in aquatic systems (Karakaya et al., 2013); to forecast TDS, EC, and turbidity in rivers (Najah et al., 2009); to estimate DO and specific conductance as two water quality parameters in rivers (Heydari et al., 2013); to assess EC, sodium absorption ratio (SAR), and total hardness (TH) in rivers (Azad et al., 2018); and to predict chemical concentrations in rivers (Mahmoodabadi and Arshad, 2018). In recent years, a hybrid of support vector regression (SVR) and the shuffled frog leaping algorithm (SFLA) was employed to forecast eight water quality parameters: Na+, K+, Mg+2, SO4−2, Cl−, pH, EC, and TDS (Mahmoudi et al., 2016). A deep learning predictive model was developed to model the DO levels of a reservoir (Banerjee et al., 2019). The least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS), and M5 model tree (M5Tree) were used to assess free ammonia (AMM), total Kjeldahl nitrogen (TKN), water temperature (WT), total coliform (TC), fecal coliform (FC), and pH (Kisi and Parmar, 2016); a comparison of the overall results indicated that the MARS and LSSVM models were more accurate than the M5Tree model. More recently, ANNs have been widely applied to surface water quality modeling.
For instance, various studies have modeled DO concentration using different AI-based models. One study developed two nonlinear predictive models, namely the modified response surface method (MRSM) and the multilayer perceptron neural network (MLPNN), to model the daily DO concentration. Another applied data-driven methods (ANN, ANFIS, SVM, and ARIMA) and three ensemble techniques, i.e., simple average ensemble (SAE), weighted average ensemble (WAE), and neural network ensemble (NNE), for single- and multi-step-ahead forecasting of DO in river water. A further study provided a model to assess the DO of water using the polynomial chaos expansion approach. Some researchers have also focused on different applied aspects of ANN methods for forecasting water quality indexes.
Recently, hybrids of the discrete wavelet transform (DWT) and robust AI models have been used to estimate water quality parameters from limited time series data. In particular, hybrid wavelet-artificial neural network (WANN) techniques have been applied to predict the EC of river water (Ravansalar and Rajaee, 2015). Specifically, the original time series of monthly EC and discharge (Q) values were decomposed using the DWT, and the decomposed sub-series were then coupled with the ANN model. The results indicated that the WANN model enhanced the modeling accuracy. Montaseri et al. (2018) utilized wavelet-ANFIS, wavelet-GEP, and other hybrid AI models to forecast TDS based on EC, Na+, and Cl− under various climatic conditions. Rajaee et al. (2018) applied wavelet-multiple linear regression (WMLR) and WANN to predict daily pH levels from the original time series of pH and discharge (Q). Barzegar et al. (2016) used ANN, ANFIS, wavelet-ANN, and wavelet-ANFIS to assess water salinity levels based on different subsets of monthly Ca+2, Mg+2, Na+, SO4−2, and Cl− in rivers. Barzegar et al. (2018) simulated multi-step-ahead EC with a hybrid wavelet-extreme learning machine (WA-ELM) model and compared it with an ANFIS model.
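To make the decomposition step concrete, the sketch below splits a short series into approximation and detail sub-series with a multi-level DWT. It assumes the Haar wavelet purely for simplicity; the mother wavelet and decomposition level used in the cited studies are not stated in this text, and the function names and sample values are hypothetical.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one Haar level, rebuilding the signal from its coefficients."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def haar_decompose(signal, levels):
    """Multi-level decomposition, returned as [A_L, D_L, ..., D_1]."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(detail)
    return [approx] + details[::-1]


# Made-up monthly EC values (uS/cm), for illustration only.
ec = [610.0, 640.0, 700.0, 690.0, 820.0, 860.0, 750.0, 720.0]
coeffs = haar_decompose(ec, levels=2)
print([len(c) for c in coeffs])
```

In a hybrid wavelet model, sub-series such as these (rather than the raw series) would be passed to the regression or AI model as inputs. The inverse function is included only to show that the decomposition loses no information.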
The aim of this study is to investigate the influence of the DWT on the LWLR method for predicting monthly EC levels in the Sefidrud River, Iran, based on multiple hydrological data (including EC and discharge). To the best of our knowledge, the hybrid wavelet-locally weighted linear regression (W-LWLR) method has not previously been used to forecast water quality parameters. In addition, few studies have used the LWLR as a data-driven method (Ahmadianfar et al., 2019; Jamei and Ahmadianfar, 2019). In this research, the W-LWLR, W-SVR, W-ARIMA, and W-MLR models are developed to assess monthly EC levels, and the results are compared with those from the original LWLR, SVR, ARIMA, and MLR models.
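For readers unfamiliar with locally weighted linear regression, the general idea is to fit a separate weighted least-squares line around each query point, giving nearby training samples more weight. The sketch below is a minimal single-input version, assuming a Gaussian weighting kernel with a bandwidth parameter named `tau`; the kernel, the bandwidth, and the function name are assumptions for illustration and are not taken from the study, which uses several inputs.

```python
import math


def lwlr_predict(x_train, y_train, x_query, tau=1.0):
    """Predict y at x_query with a locally weighted straight-line fit.

    Each training point gets the Gaussian weight exp(-(x - x_query)^2 / (2 tau^2)),
    and a weighted least-squares line is fitted in the centred variable
    u = x - x_query, so the fitted intercept is the prediction at the query.
    """
    if len(x_train) != len(y_train) or len(x_train) < 2:
        raise ValueError("need at least two paired training samples")
    sw = su = sy = suu = suy = 0.0
    for x, y in zip(x_train, y_train):
        u = x - x_query
        w = math.exp(-(u * u) / (2.0 * tau * tau))
        sw += w
        su += w * u
        sy += w * y
        suu += w * u * u
        suy += w * u * y
    if sw == 0.0:
        raise ValueError("all weights underflowed; increase tau")
    denom = sw * suu - su * su
    if abs(denom) < 1e-12:
        # Degenerate case (for example, identical x values): use the weighted mean.
        return sy / sw
    slope = (sw * suy - su * sy) / denom
    return (sy - slope * su) / sw


# Made-up pairs of (previous-month EC, current-month EC), for illustration only.
prev_ec = [600.0, 650.0, 700.0, 760.0, 820.0]
curr_ec = [640.0, 690.0, 735.0, 800.0, 850.0]
print(lwlr_predict(prev_ec, curr_ec, x_query=720.0, tau=60.0))
```

A small `tau` makes the fit follow local behaviour closely, while a large `tau` approaches ordinary linear regression. With several inputs, the same idea applies but the weighted normal equations have to be solved as a linear system instead of in closed form.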
Section snippets
Support vector machine
The SVM is a machine learning method based on statistical learning theory and the structural risk minimization principle, which was first introduced by Cortes and Vapnik (Vapnik and Cortes, 1995). SVM has been widely applied to classification and regression, and it usually outperforms the traditional methods used in previous studies (Huang et al., 2002; Sarzaeim et al., 2017). The SVR is the application of SVM to regression. Various types of kernel function such as exponential radial
Results and discussion
Tables 3–6 show the statistical metrics obtained for the LWLR, SVR, ARIMA, and MLR models, respectively. As indicated in these tables, the best input combination for the LWLR is combination 3, which contains ECt−1, ECt−2, ECt−3, Qt, Qt−1, and Qt−2, with an average rank of 2.335 (2 and 2.67 for the training and testing phases, respectively). The best input combination for the SVR is also combination 3, with an average rank of 3 (2 and 4 for the training and testing phases, respectively). For the ARIMA model, the
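As an illustration of the lagged input structure named in combination 3 (ECt−1, ECt−2, ECt−3, Qt, Qt−1, Qt−2), the sketch below builds the corresponding input rows and targets from two aligned monthly series. The function name and the sample values are hypothetical, and the assumption that the target is ECt follows from the study's stated aim of predicting monthly EC.

```python
def build_lagged_inputs(ec, q):
    """Build rows [EC(t-1), EC(t-2), EC(t-3), Q(t), Q(t-1), Q(t-2)] with target EC(t).

    ec and q are equally long monthly series; the first three months are
    dropped because their lagged values are not available.
    """
    if len(ec) != len(q):
        raise ValueError("EC and Q series must have the same length")
    rows, targets = [], []
    for t in range(3, len(ec)):
        rows.append([ec[t - 1], ec[t - 2], ec[t - 3], q[t], q[t - 1], q[t - 2]])
        targets.append(ec[t])
    return rows, targets


# Made-up monthly EC (uS/cm) and discharge values, for illustration only.
ec_series = [610.0, 640.0, 700.0, 690.0, 820.0, 860.0]
q_series = [95.0, 80.0, 62.0, 70.0, 41.0, 35.0]
inputs, outputs = build_lagged_inputs(ec_series, q_series)
print(len(inputs), inputs[0], outputs[0])
```

The same rows could be passed to any of the compared models; in the wavelet-based variants, the lagged series would first be decomposed into sub-series.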
Conclusions
In this research, the LWLR model was developed for the first time to predict water quality parameters. In particular, the LWLR was coupled with the discrete wavelet transform to forecast the monthly EC levels of surface water. To evaluate the performance and ability of the models, they were compared with the SVR, W-SVR, ARIMA, W-ARIMA, MLR, and W-MLR models. In the model development, 240 observed monthly river discharge and water EC sample data from the Astane Station were
Funding
This study was funded by the Vice-Chancellor for Research and Technology, Shohadaye Hoveizeh University of Technology (project code: IR-Civ-SHHUT97-200-3).
Declaration of Competing Interest
The authors confirm that there is no conflict of interest.
References (51)
- et al. (2019). Environmental factors as indicators of dissolved oxygen concentration and zooplankton abundance: deep learning versus traditional regression approach. Ecol. Indic.
- et al. (2005). Use of principal component scores in multiple linear regression models for prediction of Chlorophyll-a in reservoirs. Ecol. Model.
- et al. (2015). A novel hybrid water quality time series prediction method based on cloud model and fuzzy forecasting. Chemom. Intell. Lab. Syst.
- et al. (2010). Stepwise multiple regression method to forecast fish landing. Procedia Soc. Behav. Sci.
- et al. (2011). A wavelet-support vector machine conjunction model for monthly streamflow forecasting. J. Hydrol.
- et al. (2012). Precipitation forecasting by using wavelet-support vector machine conjunction model. Eng. Appl. Artif. Intell.
- et al. (2016). Application of least square support vector machine and multivariate adaptive regression spline models in long term prediction of river water pollution. J. Hydrol.
- et al. (2018). Long-term evaluation of water quality parameters of the Karoun River using a regression approach and the adaptive neuro-fuzzy inference system. Mar. Pollut. Bull.
- et al. (1970). River flow forecasting through conceptual models part I - A discussion of principles. J. Hydrol.
- et al. (2009). A combined neural-wavelet model for prediction of Ligvanchai watershed precipitation. Eng. Appl. Artif. Intell.
- Applications of hybrid wavelet–artificial intelligence models in hydrology: a review. J. Hydrol.
- A wavelet–linear genetic programming model for sodium (Na+) concentration forecasting in rivers. J. Hydrol.
- A review on the applications of wavelet transform in hydrology time series analysis. Atmos. Res.
- Multiple linear regression modeling for compositional data. Neurocomputing
- Locally weighted linear regression for cross-lingual valence-arousal prediction of affective words. Neurocomputing
- Prediction of local scour around circular piles under waves using a novel artificial intelligence approach. Mar. Georesour. Geotechnol.
- Locally Weighted Learning for Control
- Prediction of water quality parameters using ANFIS optimized by intelligence algorithms (case study: Gorganrood River). KSCE J. Civ. Eng.
- Application of wavelet-artificial intelligence hybrid models for water quality prediction: a case study in Aji-Chay River, Iran. Stoch. Env. Res. Risk A.
- Multi-step water quality forecasting using a boosting ensemble multi-wavelet extreme learning machine model. Stoch. Env. Res. Risk A.
- Comparison of wavelet-based ANN and regression models for reservoir inflow forecasting. J. Hydrol. Eng.
- Root mean square error (RMSE) or mean absolute error (MAE)? – Arguments against avoiding RMSE in the literature. Geosci. Model Dev.
- Prediction of bromate formation using multi-linear regression and artificial neural networks. Ozone Sci. Eng.
- A hybrid neural network and ARIMA model for water quality time series prediction. Eng. Appl. Artif. Intell.
- Development of a neural network technique for prediction of water quality parameters in the Delaware River, Pennsylvania. Middle-East J. Sci. Res.