The proper care and feeding of CAMELS: How limited training data affects streamflow prediction, Environmental Modelling & Software


Environmental Modelling & Software (IF 4.8), Pub Date: 2020-11-13, DOI: 10.1016/j.envsoft.2020.104926
Martin Gauch, Juliane Mai, Jimmy Lin

Accurate streamflow prediction largely relies on historical meteorological records and streamflow measurements. For many regions, however, such data are only scarcely available. Facing this problem, many studies simply trained their machine learning models on the region's available data, leaving possible repercussions of this strategy unclear. In this study, we evaluate the sensitivity of tree- and LSTM-based models to limited training data, both in terms of geographic diversity and different time spans. We feed the models meteorological observations disseminated with the CAMELS dataset, and individually restrict the training period length, number of training basins, and input sequence length. We quantify how additional training data improve predictions and how many previous days of forcings we should feed the models to obtain best predictions for each training set size. Further, our findings show that tree- and LSTM-based models provide similarly accurate predictions on small datasets, while LSTMs are superior given more training data.
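The three restrictions named in the abstract (training period length, number of training basins, and input sequence length) can be pictured with a small sketch. This is not the authors' code: the function names `make_windows` and `build_training_set`, the dictionary layout of `data`, the random basin selection, and the synthetic values are all assumptions made for illustration, and real CAMELS forcings would have to be loaded separately.

```python
import numpy as np


def make_windows(forcings, streamflow, seq_len):
    """Pair each day's streamflow with the seq_len most recent days of forcings.

    forcings:   array of shape (n_days, n_features)
    streamflow: array of shape (n_days,)
    The window for day t covers days t - seq_len + 1 through t (inclusive).
    """
    n_days = forcings.shape[0]
    X = np.stack(
        [forcings[t - seq_len + 1 : t + 1] for t in range(seq_len - 1, n_days)]
    )
    y = streamflow[seq_len - 1 :]
    return X, y


def build_training_set(data, n_basins, n_train_days, seq_len, seed=0):
    """Restrict basins, training period, and sequence length (hypothetical helper).

    data maps a basin id to a (forcings, streamflow) tuple. Basins are drawn
    at random here, which is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    basin_ids = sorted(data)
    picked = rng.choice(len(basin_ids), size=n_basins, replace=False)

    X_parts, y_parts = [], []
    for i in picked:
        forcings, flow = data[basin_ids[i]]
        # Keep only the first n_train_days days of this basin's record.
        X, y = make_windows(forcings[:n_train_days], flow[:n_train_days], seq_len)
        X_parts.append(X)
        y_parts.append(y)
    return np.concatenate(X_parts), np.concatenate(y_parts)


# Synthetic stand-in data: 10 basins, 365 days, 5 forcing variables each.
rng = np.random.default_rng(42)
data = {f"basin_{i}": (rng.random((365, 5)), rng.random(365)) for i in range(10)}

X, y = build_training_set(data, n_basins=3, n_train_days=100, seq_len=10)
print(X.shape, y.shape)
```

Each basin contributes `n_train_days - seq_len + 1` samples, so shrinking the period or the basin count shrinks the training set directly, while a longer input sequence gives the model more history per sample at the cost of a few samples at the start of each record.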


Updated: 2020-11-21
