Predicting Lake Erie wave heights and periods using XGBoost and LSTM
Ocean Modelling (IF 3.2), Pub Date: 2021-06-08, DOI: 10.1016/j.ocemod.2021.101832
Haoguo Hu, André J. van der Westhuysen, Philip Chu, Ayumi Fujisaki-Manome

Waves in large lakes put coastal communities and vessels under threat, and accurate wave predictions are needed for early warnings. While physics-based numerical wave models such as WAVEWATCH III (WW3) are useful to provide spatial information to supplement in situ observations, they require intensive computational resources. An attractive alternative is machine learning (ML) methods, which can potentially improve the performance of numerical wave models, while only requiring a small fraction of the computational cost. In this study, we applied novel ML methods based on XGBoost and a Long Short-Term Memory (LSTM) recurrent neural network for predicting wave height and period under the near-idealized wave growth conditions of Lake Erie. Data sets of significant wave height (H), peak wave period (Tp) and surface wind from two offshore buoys from 1994 to 2017 were processed for model training and testing. We trained and validated the ML models with the data sets from 1994 to 2015, and then used the trained models to predict significant wave height and peak period for 2016 and 2017. The XGBoost model yielded the best overall performance, with Mean Absolute Percentage Error (MAPE) values of 15.6%–22.9% in H and 8.3%–13.4% in Tp. The LSTM model yielded MAPE values of 23.4%–30.8% in H and 9.1%–13.6% in Tp. An unstructured grid WW3 applied to Lake Erie yielded MAPE values of 15.3%–21.0% in H and 12.5%–19.3% in Tp. However, WW3 underestimated H and Tp during strong wind events, with relative biases of -11.76% to -14.15% in H and -15.59% to -19.68% in Tp. XGBoost and LSTM improve on these predictions with relative biases of -2.56% to -10.61% in H and -8.08% to -10.13% in Tp. An ensemble mean of these three models yielded lower scatter scores than the members, with MAPE values of 13.3%–17.3% in H and 8.0%–13.0% in Tp, although it did not improve the bias. 
The ML models ran significantly faster than WW3: For this 2-year run on the same computing environment, WW3 needed 24 h with 60 CPUs, whereas the trained LSTM needed 0.24 s on 1 CPU, and the trained XGBoost needed only 0.03 s on 1 CPU.
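
The abstract compares the models by MAPE and relative bias and also reports an ensemble mean of the three models. The minimal Python sketch below shows how such scores could be computed. It is not the authors' code: the function names are hypothetical, the relative-bias definition (sum of errors divided by sum of observations) is one common convention and only an assumption about what the paper uses, and the sample numbers are invented for illustration rather than taken from the buoy data.

```python
from statistics import fmean


def mape(observed, predicted):
    """Mean Absolute Percentage Error, in percent.

    Assumes every observed value is non-zero.
    """
    return 100.0 * fmean(abs(p - o) / abs(o) for o, p in zip(observed, predicted))


def relative_bias(observed, predicted):
    """Relative bias in percent: sum of (predicted - observed) over sum of observed.

    This is an assumed definition; negative values mean underestimation.
    """
    total_error = sum(p - o for o, p in zip(observed, predicted))
    return 100.0 * total_error / sum(observed)


def ensemble_mean(*member_predictions):
    """Point-by-point average of several models' prediction series."""
    return [fmean(values) for values in zip(*member_predictions)]


if __name__ == "__main__":
    # Invented wave heights in metres, for illustration only.
    observed = [1.0, 2.0, 4.0]
    model_a = [1.2, 1.8, 3.6]
    model_b = [0.9, 2.2, 3.8]
    model_c = [0.8, 1.6, 3.2]

    combined = ensemble_mean(model_a, model_b, model_c)
    for name, series in [("A", model_a), ("B", model_b),
                         ("C", model_c), ("ensemble", combined)]:
        print(f"{name}: MAPE={mape(observed, series):.1f}% "
              f"bias={relative_bias(observed, series):.1f}%")
```

Averaging lets errors of opposite sign cancel at each point, which can reduce scatter, but when all members underestimate (as the abstract reports for strong wind events) the average underestimates too. That is consistent with the abstract's remark that the ensemble lowered MAPE without improving the bias.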




Updated: 2021-06-24