Comparative study on total nitrogen prediction in wastewater treatment plant and effect of various feature selection methods on machine learning algorithms performance, Journal of Water Process Engineering


Journal of Water Process Engineering (IF 6.3), Pub Date: 2021-03-31, DOI: 10.1016/j.jwpe.2021.102033
Faramarz Bagherzadeh, Mohamad-Javad Mehrani, Milad Basirifard, Javad Roostaei

Predicting wastewater characteristics in wastewater treatment plants (WWTPs) is valuable because it can reduce the number of samples required, energy use, and cost. Feature selection (FS) methods are applied during pre-processing to improve model performance. This study evaluates the effect of seven FS methods (covering filter, wrapper, and embedded approaches) on the accuracy of total nitrogen (TN) prediction in the WWTP influent flow. Four scenarios based on the FS suggestions were defined and compared using three supervised machine learning (ML) algorithms: Artificial Neural Network (ANN), Random Forest (RF), and Gradient Boosting Machine (GBM). The data were daily time series of pH, DO, COD, BOD, MLSS, MLVSS, NH₄-N, and TN concentration. The data set was divided into training and unseen test sets, and all models were evaluated using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the coefficient of determination (R²). The results show that scenario IV, which was suggested by Mutual Information and comprises NH₄-N, COD, BOD, and DO, performed better than the scenarios suggested by the other FS methods. Furthermore, the decision-tree-based algorithms (RF and GBM) outperformed the neural network (ANN). GBM generalized the data set patterns well and produced the best performance on the unseen test set, which indicates the effectiveness of this ML algorithm for predicting wastewater components.
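To make the two central ideas of the abstract concrete, the following is a minimal sketch of (a) ranking candidate inputs by mutual information with the target and (b) computing RMSE, MAE, and R². It is not the authors' code. The binned mutual information estimator, the function names, and the synthetic NH₄-N, pH, and TN values are all assumptions chosen for illustration, and R² is computed here as the coefficient of determination, which may differ from the definition used in the study.

```python
import math
import random
from collections import Counter


def binned_mutual_information(x, y, bins=8):
    """Estimate mutual information (in nats) between two numeric sequences
    by discretising each into equal-width bins (a simple, biased estimator)."""
    def discretise(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant input
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretise(x), discretise(y)
    n = len(x)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    # sum over occupied cells of p(x,y) * log(p(x,y) / (p(x) * p(y)))
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic, hypothetical data: TN depends on NH4-N but not on pH.
random.seed(0)
n = 500
nh4 = [random.uniform(20, 60) for _ in range(n)]
ph = [random.uniform(6.5, 8.5) for _ in range(n)]
tn = [1.3 * a + random.gauss(0, 2) for a in nh4]

scores = {
    "NH4-N": binned_mutual_information(nh4, tn),
    "pH": binned_mutual_information(ph, tn),
}
ranking = sorted(scores, key=scores.get, reverse=True)
print("Features ranked by estimated mutual information with TN:", ranking)
```

In this toy setup the informative input ranks above the uninformative one, which mirrors how a filter-type FS method would propose an input subset before any model is trained. A real study would more likely rely on a library estimator and tuned models, evaluated on a held-out test set with the three metrics above.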


Updated: 2021-03-31
