Challenges in Applying Machine Learning Models for Hydrological Inference: A Case Study for Flooding Events Across Germany
Water Resources Research (IF 5.4), Pub Date: 2020-05-14, DOI: 10.1029/2019wr025924
Lennart Schmidt 1, 2 , Falk Heße 2, 3 , Sabine Attinger 2, 3 , Rohini Kumar 2

Machine learning (ML) algorithms are increasingly used in Earth and environmental modeling studies owing to the ever‐increasing availability of diverse data sets and computational resources as well as advances in the algorithms themselves. Despite gains in predictive accuracy, the usefulness of ML algorithms for inference remains elusive. In this study, we employ two popular ML algorithms, artificial neural networks and random forest, to analyze a large data set of flood events across Germany, with the goals of assessing their predictive accuracy and their usability for providing insights into hydrologic system functioning. The results of the ML algorithms are contrasted against a parametric approach based on multiple linear regression. For the analysis, we employ a model‐agnostic framework named Permuted Feature Importance to derive the influence of the models' predictors. This allows us to compare the results of different algorithms for the first time in the context of hydrology. Our main findings are that (1) the ML models achieve higher prediction accuracy than linear regression, (2) the results reflect basic hydrological principles, but (3) further inference is hindered by the heterogeneity of results across algorithms. Thus, we conclude that the problem of equifinality, as known from classical hydrological modeling, also exists for ML and severely hampers its potential for inference. To account for the observed problems, we propose that ML‐based inference should use multiple algorithms and multiple methods, of which the latter should be embedded in a cross‐validation routine.
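
The workflow summarized in the abstract (several algorithms, a model-agnostic permutation-based importance measure, and a cross-validation loop) can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn, uses synthetic placeholder data in place of the German flood-event data set, and the helper name `cv_permutation_importance`, the hyperparameters, and the R² scoring choice are all hypothetical.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def cv_permutation_importance(models, X, y, n_splits=5, n_repeats=5, seed=0):
    """Hypothetical helper: permutation importance per model and per CV fold.

    Returns a dict mapping model name to an array of shape
    (n_splits, n_features). Each entry is the mean drop in held-out R^2
    when one predictor column is shuffled.
    """
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    per_model = {name: [] for name in models}
    for train_idx, test_idx in cv.split(X):
        for name, model in models.items():
            # Fit a fresh copy on the training fold only.
            fitted = clone(model).fit(X[train_idx], y[train_idx])
            # Score the importance on the held-out fold.
            result = permutation_importance(
                fitted, X[test_idx], y[test_idx],
                scoring="r2", n_repeats=n_repeats, random_state=seed,
            )
            per_model[name].append(result.importances_mean)
    return {name: np.vstack(rows) for name, rows in per_model.items()}


# Synthetic placeholder data (an assumption, not the flood-event data set).
X, y = make_regression(n_samples=300, n_features=5, n_informative=2,
                       noise=10.0, random_state=0)
y = (y - y.mean()) / y.std()  # standardize the target for the neural network

# Illustrative model settings, not those used in the study.
models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "neural_network": make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
    ),
}

importances = cv_permutation_importance(models, X, y)
for name, imp in importances.items():
    # Mean and spread across folds; a large spread signals unstable rankings.
    print(name, imp.mean(axis=0).round(3), imp.std(axis=0).round(3))
```

Comparing the per-fold importance arrays across the three models is one way to see the kind of heterogeneity the abstract refers to: models with similar held-out accuracy can still rank predictors differently.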

Updated: 2020-05-14