Multimodal deep learning for cetacean distribution modeling of fin whales (Balaenoptera physalus) in the western Mediterranean Sea
Machine Learning (IF 7.5), Pub Date: 2021-07-26, DOI: 10.1007/s10994-021-06029-z
D. Cazau 1 , P. Nguyen Hong Duc 2 , J.-N. Druon 3 , S. Matwins 4 , R. Fablet 5

Cetacean Distribution Modeling (CDM) is used to quantify the distributions and densities of mobile marine species. It is essential for better understanding and protecting whales and their relatives. Current CDM approaches often fail to capture general species-environment relationships, i.e. relationships that remain valid across the broader range of environmental conditions characterizing the surveyed regions. This paper investigates the usefulness of deep learning based schemes, namely multi-task and transfer learning, in CDM. A stochastic presence-background model was co-trained on a classification task together with a deterministic rule-based model on a regression task. Whale presence-only records were used for the first task, and the index outputs of a feeding habitat occurrence model for the second. This new approach was tested on the case study of fin whales in the western Mediterranean Sea. To evaluate it, a new metric called True Positive rate per unit of Surface Habitat (TPSH) and an original multimodal fully-connected neural network were developed. A Generalized Additive Model (GAM), a standard CDM method, was also used as a performance reference. Results show that, on our TPSH metric and in relative terms, our multi-task learning model improves on the feeding habitat model by 10.8% and on data-driven models such as GAM by 16.5%, indicating that our approach estimates whale presence more accurately. These trends were further supported by two other independent datasets that forced the models to generalize beyond the species-environment relationships of their training dataset. Performance could be improved further by adopting better thresholds, as observed from Receiver Operating Characteristic curves; e.g. the multi-task learning model could reach absolute gains of up to 10% in the median True Positive Rate while maintaining the spatial spread of its habitat.
Overall, our work confirmed our working hypothesis that expert information on whale behaviour represents a good knowledge base for model generalization. This result can be further improved by concurrently learning more local species-environment relationships from in-situ presence data.
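The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. The joint objective is assumed to be a weighted sum of a binary cross-entropy (presence classification) and a mean squared error (regression on the habitat index); the `weight` parameter and both function names are hypothetical. The second function is one plausible reading of "True Positive rate per unit of Surface Habitat" (true positive rate divided by the fraction of the area flagged as habitat) and may differ from the paper's actual TPSH definition.

```python
import numpy as np

def joint_loss(p_presence, y_presence, habitat_pred, habitat_index, weight=0.5):
    """Weighted sum of a classification loss and a regression loss.

    `weight` is a hypothetical mixing coefficient, not a value from the paper.
    """
    eps = 1e-12
    p = np.clip(np.asarray(p_presence, dtype=float), eps, 1 - eps)
    y = np.asarray(y_presence, dtype=float)
    # Binary cross-entropy on presence (1) versus background (0) points.
    bce = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    # Mean squared error against the feeding habitat model's index output.
    diff = np.asarray(habitat_pred, dtype=float) - np.asarray(habitat_index, dtype=float)
    mse = np.mean(diff ** 2)
    return weight * bce + (1 - weight) * mse

def tp_rate_per_habitat_fraction(predicted_habitat, sightings):
    """Assumed TPSH-like ratio: TPR divided by the predicted habitat fraction.

    Both inputs are per-cell boolean masks of equal length. The ratio is
    undefined if there are no sightings or no predicted habitat cells.
    """
    predicted = np.asarray(predicted_habitat, dtype=bool)
    seen = np.asarray(sightings, dtype=bool)
    tpr = (predicted & seen).sum() / seen.sum()
    habitat_fraction = predicted.mean()
    return tpr / habitat_fraction
```

Under this reading, a model that flags half of the study area as habitat and captures every sighting scores 2.0, so covering the same sightings with a smaller predicted habitat yields a higher score.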




Updated: 2021-07-27