Reliability assessment of water quality index based on guidelines of national sanitation foundation in natural streams: integration of remote sensing and data-driven models

Artificial Intelligence Review

Abstract

Rivers, as one of the freshwater resources, are increasingly jeopardized in both quantity and quality by industrial, agricultural, and urban development. Water quality management depends on reliable prediction of the Water Quality Index (WQI), and accurate estimation of the WQI remains one of the most challenging problems in water quality studies of surface water resources. A broad range of traditional methodologies exists for WQI evaluation. Owing to the intrinsic limitations of these conventional models, Data-Driven Models (DDMs) have frequently been employed to assess the WQI of natural streams. In the present research, WQI values and their typical classifications were obtained following the guidelines of the National Sanitation Foundation (NSF). Four well-known DDMs, namely Evolutionary Polynomial Regression (EPR), M5 Model Tree (MT), Gene-Expression Programming (GEP), and Multivariate Adaptive Regression Spline (MARS), were then employed to predict the WQI of the Karun River. Twelve Water Quality Parameters (Dissolved Oxygen, Chemical Oxygen Demand, Biochemical Oxygen Demand, Electrical Conductivity, Nitrate, Nitrite, Phosphate, Turbidity, pH, Calcium, Magnesium, and Sodium) were collected from nine hydrometry stations, and missing values of water temperature were extracted by analysis of Landsat-7 ETM+ images. Furthermore, the Gamma Test (GT), Forward Selection (FS), Polynomial Chaotic Expression (PCE), and Principal Component Analysis (PCA) were used to reduce the number of input variables fed to the DDMs. The results demonstrated that FS-M5 MT performed best in estimating the WQI classification. WQI values for the Karun River were then assessed in a reliability-based probabilistic framework to account for uncertainty and randomness in the input parameters. To this end, the Monte-Carlo scenario sampling technique was used to evaluate the limit state function derived from the DDM-based WQI formulation. Based on the qualitative description of the WQI, the Karun River is classified as “Relatively Bad” quality. Moreover, the reliability analysis indicates only a 19% chance that a specimen from the Karun River has a better quality index.
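The reliability step described in the abstract can be illustrated with a minimal Python sketch of Monte-Carlo sampling over a limit state function. Everything specific here is an assumption made for illustration and is not taken from the article: the linear surrogate `wqi_surrogate`, its coefficients, the three inputs, and their probability distributions are hypothetical stand-ins for the DDM-based WQI formulation and the fitted input distributions.

```python
import numpy as np


def wqi_surrogate(do, bod, turbidity):
    """Hypothetical linear stand-in for a DDM-based WQI formula.

    The coefficients are illustrative only and do not come from the article.
    """
    return 60.0 + 2.5 * do - 1.5 * bod - 0.05 * turbidity


def probability_of_exceedance(threshold, n_samples=100_000, seed=0):
    """Estimate P(WQI >= threshold) by Monte-Carlo sampling of the inputs."""
    rng = np.random.default_rng(seed)
    # Assumed input distributions (hypothetical means and spreads).
    do = rng.normal(6.0, 1.0, n_samples)                         # mg/L
    bod = rng.lognormal(mean=np.log(4.0), sigma=0.3, size=n_samples)   # mg/L
    tu = rng.lognormal(mean=np.log(50.0), sigma=0.5, size=n_samples)   # NTU
    # Keep the index on the usual 0 to 100 scale.
    wqi = np.clip(wqi_surrogate(do, bod, tu), 0.0, 100.0)
    # Limit state function: g >= 0 means the sample meets the threshold.
    g = wqi - threshold
    return float(np.mean(g >= 0.0))


if __name__ == "__main__":
    for threshold in (50.0, 70.0):
        p = probability_of_exceedance(threshold)
        print(f"P(WQI >= {threshold:.0f}) ~ {p:.3f}")
```

With a real surrogate and fitted distributions, the same loop would yield the kind of exceedance probability the abstract reports; the numbers printed by this sketch carry no physical meaning.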

Abbreviations

AI:

Artificial Intelligence

ANFIS:

Adaptive Neuro-Fuzzy Inference System

A:

Slope parameter, which normally carries useful information about the complexity of the relationship between input and output variables

aij:

Weighting coefficients of principal components

AL:

Offset which has a certain value for each band

ANNs:

Artificial Neural Networks

APLR:

Adaptive Piecewise Linear Regression

b1, b2, b3, b4:

Weighting coefficients of multivariate linear equation by MT

BFs:

Basis Functions

BOD:

Biochemical Oxygen Demand

C:

A closed bounded set

c1, c2, c3,… c13:

Weighting coefficients of multivariate linear equation by MT

Ca2+ :

Calcium

CCME:

Canadian Council of Ministers of the Environment

CL:

Confidence Level

COD:

Chemical Oxygen Demand

DDMs:

Data-Driven Models

DN:

Digital Number

Dn:

Absolute difference between the numerical and theoretical cumulative distributions associated with the parameter

\(D_{n}^{u}\) :

Acceptable limit for the Dn

DO:

Dissolved Oxygen

DOsat:

Dissolved Oxygen in the saturated state

e:

Model error known as the uncertainty parameter

EC:

Electrical Conductivity

EPR:

Evolutionary Polynomial Regression

ET:

Expression Tree

ETM+ :

Enhanced Thematic Mapper Plus

F0:

F ratio

FC:

Fecal Coliform

FORM:

First Order Reliability Method

FOSM:

First-Order Second Moment

FS:

Forward Selection

FX(r):

Theoretical cumulative distribution associated with r parameter

G:

Green Spectral Band

GA:

Genetic Algorithm

GEP:

Gene-Expression Programming

GMDH:

Group Method of Data Handling

Gn(r):

Numerical cumulative distribution associated with r parameter

GP:

Genetic Programming

GT:

Gamma Test

GT0:

The intercept on the vertical axis (δ = 0)

h:

A function for establishing a relationship among WQPs and WQI

I:

Unit matrix

IOA:

Index of Agreement

k:

Number of the nearest neighbors

k′:

Number of elements in input variables

K1, K2:

Band-specific thermal conversion constants

KMO:

Kaiser–Meyer–Olkin

K-S:

Kolmogorov–Smirnov test

LS:

Least Squares

LSF:

Limit-State Function

Lλ:

Top-of-Atmosphere Radiance

M:

Maximum number of mathematical terms

MAE:

Mean Absolute Error

MARS:

Multivariate Adaptive Regression Spline

Mg2+ :

Magnesium

ML:

Gain coefficient

MMSE:

Minimum Mean Square Error

MOGA:

Multi-Objective Genetic Algorithm

MSE:

Mean Squared Error

MT:

Model Tree

n:

Number of input variables

n′:

Number of observations

Na+ :

Sodium

NDWI:

Normalized Difference Water Index

NH4:

Ammonium

NIR:

Near Infra-Red Spectral Band

\(NO_{3}^{ - }\) :

Nitrate Nitrogen

NSF:

National Sanitation Foundation

p:

Maximum number of input variables

PCA:

Principal Component Analysis

PCC:

Positive Coefficient of Correlation

PCE:

Polynomial Chaotic Expression

PE:

Probability of Exceedance

pf:

Probability of Failure

pH:

Potential of Hydrogen

\(PO_{4}^{3 - }\) :

Phosphate

Qcal:

Value of DN

R:

Coefficient of correlation

RAE:

Relative Absolute Error

RE:

Relative Error

RMSE:

Root Mean Square Error

ROI:

Region of Interest

RRSE:

Root Relative Squared Error

s:

Number of basis functions

SORM:

Second-Order Reliability Method

SSE:

Sum of Squared Errors

SST:

Sea Surface Temperature

SVM:

Support Vector Machine

T:

Temperature

TB:

Brightness Temperature

TH:

Total Hardness

Tu:

Turbidity

u:

Significance level

USGS:

United States Geological Survey

VCM:

Variance–covariance matrix

WCs:

Weighting Coefficients

WQI:

Water Quality Index

WQIAC:

Acceptable values of WQI

WQIME:

Measured values of WQI

WQPs:

Water Quality Parameters

WST:

Water Surface Temperature

x:

Input vectors known as WQPs

X1, X2, X3,… Xn:

Input variables associated with the limit state function

y:

Output vector known as WQI

α:

Significance level used in the F test

δ:

The function associated with the Euclidean distance

δ′:

A collection of coefficients used in EPR formulation

θ:

Input variables vector for a specific problem

λ:

Eigenvalues

μ(x):

Basis function

π:

Overall formulation by EPR

ρ:

Weighting coefficients used in formulation obtained by MARS model

ϕ(x):

Formulation obtained by MARS model

ϕ1, ϕ2, ϕ3:

Functions for establishing a relationship among WQPs

ω:

User-defined function with various mathematical structures

\(\underset{\sim}{x}\) :

Vector of random variables
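Several of the symbols listed above (Qcal, ML, AL, Lλ, K1, K2, and TB) belong to the conversion of Landsat thermal-band digital numbers into brightness temperature. The Python sketch below shows the standard two-step conversion documented by the USGS: Lλ = ML · Qcal + AL, followed by TB = K2 / ln(K1 / Lλ + 1). It is offered as an assumption about how these symbols fit together, not as the authors' exact procedure. The K1 and K2 values are those commonly published for Landsat-7 ETM+ band 6 and should be checked against the scene metadata; the gain, offset, and digital number in the usage example are hypothetical.

```python
import math

# Commonly published Landsat-7 ETM+ band 6 thermal constants
# (verify against the metadata file of the actual scene).
K1_ETM_PLUS = 666.09   # W / (m^2 sr um)
K2_ETM_PLUS = 1282.71  # Kelvin


def toa_radiance(q_cal, m_l, a_l):
    """Top-of-atmosphere radiance: L_lambda = M_L * Q_cal + A_L."""
    return m_l * q_cal + a_l


def brightness_temperature(l_lambda, k1=K1_ETM_PLUS, k2=K2_ETM_PLUS):
    """Brightness temperature in Kelvin: T_B = K2 / ln(K1 / L_lambda + 1)."""
    if l_lambda <= 0.0:
        raise ValueError("radiance must be positive")
    return k2 / math.log(k1 / l_lambda + 1.0)


if __name__ == "__main__":
    # Hypothetical gain, offset, and digital number for one water pixel.
    radiance = toa_radiance(q_cal=130, m_l=0.067, a_l=-0.067)
    t_kelvin = brightness_temperature(radiance)
    print(f"T_B = {t_kelvin:.1f} K ({t_kelvin - 273.15:.1f} deg C)")
```

Brightness temperature is not yet water surface temperature; any emissivity or atmospheric correction applied on top of it would be an additional step not shown here.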

Author information

Corresponding author

Correspondence to Mohammad Najafzadeh.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 1112 kb)

About this article

Cite this article

Najafzadeh, M., Homaei, F. & Farhadi, H. Reliability assessment of water quality index based on guidelines of national sanitation foundation in natural streams: integration of remote sensing and data-driven models. Artif Intell Rev 54, 4619–4651 (2021). https://doi.org/10.1007/s10462-021-10007-1

  • DOI: https://doi.org/10.1007/s10462-021-10007-1
