Benefits of machine learning and sampling frequency on phytoplankton bloom forecasts in coastal areas
Ecological Informatics (IF 5.8), Pub Date: 2020-09-28, DOI: 10.1016/j.ecoinf.2020.101174
Jonathan Derot, Hiroshi Yajima, François G. Schmitt

In aquatic ecosystems, anthropogenic activities disrupt nutrient fluxes, thereby promoting harmful algal blooms that can directly affect economies and human health. Within this framework, forecasting a chlorophyll a proxy in coastal areas is the first step toward managing these algal blooms. The primary goal was to analyze, using a machine learning model, how phytoplankton bloom forecasts are affected by different sampling frequencies. The database used in this study was sourced from an automated system located in the English Channel, which samples every 20 min. We considered 12 physicochemical parameters over a six-year period. Our forecast methodology is based on the random forest (RF) model and a sliding window strategy. The lag times for these sliding windows ranged from 12 h to 3 months, and four different sampling intervals were tested, the coarsest being 1 day.
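The sliding-window setup described above can be illustrated with a minimal sketch. It is not the authors' code: the function names `downsample` and `sliding_windows` are hypothetical, the data are synthetic, and block averaging is only one assumed way of moving from the 20 min step to a coarser one (the abstract does not say how the coarser series were built).

```python
from statistics import mean


def downsample(series, factor):
    """Average consecutive blocks of `factor` samples (factor=3 turns 20 min into 1 h).

    Assumption: block averaging; the abstract does not state the resampling rule.
    """
    usable = len(series) - len(series) % factor  # drop an incomplete trailing block
    return [mean(series[i:i + factor]) for i in range(0, usable, factor)]


def sliding_windows(series, window, horizon):
    """Pair each run of `window` past values with the value `horizon` steps later."""
    X, y = [], []
    for start in range(len(series) - window - horizon + 1):
        end = start + window
        X.append(series[start:end])          # predictors: the last `window` samples
        y.append(series[end + horizon - 1])  # target: value `horizon` steps ahead
    return X, y


# Synthetic fluorescence-like readings at a 20 min step (illustrative values only).
raw = [float(i % 12) for i in range(72)]   # 72 samples = 1 day at 20 min
hourly = downsample(raw, 3)                # 24 samples = 1 day at 1 h
X, y = sliding_windows(hourly, window=12, horizon=1)  # 12 h lag at a 1 h step
```

Each row of `X` could then be passed, together with `y`, to a random forest regressor such as scikit-learn's `RandomForestRegressor`. Repeating the procedure for each sampling interval and lag time would give one forecast score per configuration.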

The results indicate that the optimal forecast was obtained for a 20 min time step, with an average R² of 0.62. Moreover, the highest values of fluorescence were predicted when the water temperature was approximately 11.8 °C. Consequently, we demonstrated that the sampling frequency directly impacts the forecast performance of an RF model. Furthermore, this kind of model can recreate interactions that closely resemble biological processes. Our study suggests that the RF model can utilize the additional information contained in high-frequency datasets. The methodology presented here lays the foundation for the development of a numerical decision-making tool that could help mitigate the impact of these algal blooms.
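For readers unfamiliar with the score reported above, the sketch below computes a coefficient of determination and averages it over several forecasts. It rests on two assumptions: that the reported value is the usual coefficient of determination (the abstract does not define it), and that the average is taken over individual forecast runs. The function name `r_squared` and all numbers are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Illustrative values only, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(observed, [1.0, 2.0, 3.0, 4.0])   # forecast matches exactly
baseline = r_squared(observed, [2.5, 2.5, 2.5, 2.5])  # forecast is just the mean

# Averaging the score over several forecast runs (assumed meaning of "average").
average_score = (perfect + baseline) / 2
```

A score of 1 means the forecast reproduces the observations exactly, while a score of 0 means it does no better than always predicting the observed mean.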




Updated: 2020-10-13