Using Machine-Learning Methods to Improve Surface Wind Speed from the Outputs of a Numerical Weather Prediction Model
Boundary-Layer Meteorology (IF 4.3), Pub Date: 2021-01-14, DOI: 10.1007/s10546-020-00586-x
Naveen Goutham, Bastien Alonzo, Aurore Dupré, Riwal Plougonven, Rebeca Doctors, Lishan Liao, Mathilde Mougeot, Aurélie Fischer, Philippe Drobinski

The relationship between the wind speed derived from the outputs of a numerical-weather-prediction model and from observations is explored using statistical and machine-learning models. Eight years of wind-speed measurements at a height of 10 m (from 2010 to 2017) from 171 stations spread over mainland France and Corsica are used for reference. Operational analyses from the European Centre for Medium-Range Weather Forecasts (ECMWF) provide the model information not only on the surface flow, but also on other aspects of the atmospheric state at (or above) the location of each station. In a first step, a large number of explanatory variables are used as input to several models (linear regressions, k-nearest neighbours, random forests, and gradient boosting). The modelled wind speed in the ECMWF analyses, by itself, has root-mean-square errors over all stations distributed widely around a median of 1.42 m s$^{-1}$. Using statistical post-processing and making use of a historical dataset for training, the median of the root-mean-square errors at all stations can be reduced to 1.07 m s$^{-1}$ when modelled with linear regressions, and to 0.94 m s$^{-1}$ with the machine-learning models (random forests or gradient boosting). Yet more significant decreases are found for coastal stations, where the errors are largest. The random-forest models are further explored to reduce the list of explanatory variables: a list of 25 explanatory variables, mainly consisting of flow variables (wind speed, velocity components, horizontal gradients of geopotential on different isobaric surfaces, wind shear between 10 and 100 m) and including marginally some temperature variables, appears as a good compromise between performance and simplicity.
Finally, as a preliminary test for further work, the relation thus captured between the model outputs and the observed wind speed at a given time is applied to forecasts of the numerical-weather-prediction model, for lead times up to 24 h. The machine-learning model is found to be essentially as relevant on the forecasts as it was on the analyses, encouraging further use and development of these approaches for local wind-speed forecasts.
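
The evaluation metric described in the abstract (a root-mean-square error per station, summarized by its median over all stations) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the station biases and noise level are invented, and a single-predictor least-squares fit stands in for the paper's models, which use many explanatory variables and include random forests and gradient boosting. The function names (`rmse`, `fit_line`, `synthetic_station`, `evaluate`) are hypothetical and do not come from the paper.

```python
import math
import random
import statistics


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )


def fit_line(x, y):
    """Ordinary least squares for y ~ a * x + b with a single predictor."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def synthetic_station(rng, n=400):
    """Hypothetical station: observed wind is a biased, noisy function of model wind."""
    scale = rng.uniform(0.6, 1.4)    # assumed station-specific multiplicative bias
    offset = rng.uniform(-1.0, 1.0)  # assumed station-specific additive bias (m/s)
    model = [rng.uniform(0.0, 15.0) for _ in range(n)]
    obs = [max(0.0, scale * m + offset + rng.gauss(0.0, 0.8)) for m in model]
    return model, obs


def evaluate(n_stations=20, seed=0):
    """Return (median raw RMSE, median post-processed RMSE) over synthetic stations."""
    rng = random.Random(seed)
    raw_errors, corrected_errors = [], []
    for _ in range(n_stations):
        model, obs = synthetic_station(rng)
        split = len(model) // 2  # first half for training, second half for testing
        a, b = fit_line(model[:split], obs[:split])
        corrected = [a * m + b for m in model[split:]]
        raw_errors.append(rmse(model[split:], obs[split:]))
        corrected_errors.append(rmse(corrected, obs[split:]))
    return statistics.median(raw_errors), statistics.median(corrected_errors)


if __name__ == "__main__":
    raw, corrected = evaluate()
    print(f"median RMSE, raw model wind:   {raw:.2f} m/s")
    print(f"median RMSE, after linear fit: {corrected:.2f} m/s")
```

Each station gets its own fit, trained on one part of the record and scored on the rest, so the reported error reflects data the fit has not seen. The numbers this sketch prints describe only the synthetic setup and are not comparable to the values reported in the abstract.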

Updated: 2021-01-14