Detecting, classifying, and counting blue whale calls with Siamese neural networks, The Journal of the Acoustical Society of America

Detecting, classifying, and counting blue whale calls with Siamese neural networks
The Journal of the Acoustical Society of America (IF 2.4), Pub Date: 2021-05-06, DOI: 10.1121/10.0004828
Ming Zhong ₁, Maelle Torterotot ₂, Trevor A Branch ₃, Kathleen M Stafford ₄, Jean-Yves Royer ₂, Rahul Dodhia ₁, Juan Lavista Ferres ₁

The goal of this project is to use acoustic signatures to detect, classify, and count the calls of four acoustic populations of blue whales so that, ultimately, the conservation status of each population can be better assessed. We used manual annotations from 350 h of audio recordings from underwater hydrophones in the Indian Ocean to build a deep learning model that detects, classifies, and counts the calls of four acoustic song types. The method we used was Siamese neural networks (SNN), a class of neural network architectures that measure the similarity of inputs by comparing their feature vectors, and we found that they outperformed the more widely used convolutional neural networks (CNN). Specifically, the SNN outperformed a CNN by 2% in accuracy for population classification and by 1.7%–6.4% in accuracy for call count estimation for each blue whale population. In addition, even though we treated call count estimation as a classification task and encoded the number of calls in each spectrogram as a categorical variable, the SNN surprisingly learned the ordinal relationship among the counts. SNN are robust and are shown here to be an effective way to automatically mine large acoustic datasets for blue whale calls.
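The core idea of a Siamese network, comparing inputs through the feature vectors produced by one shared encoder, can be illustrated with a minimal sketch. This is not the authors' model: the abstract does not describe their architecture or training procedure. The names `make_encoder`, `distance`, and `classify`, the toy 4-value inputs, and the labels `type_a` and `type_b` are all hypothetical, and the encoder uses untrained random weights where a real SNN would learn them (for example with a contrastive or triplet loss) from spectrograms.

```python
import math
import random


def make_encoder(input_dim, embed_dim, seed=0):
    """Build ONE embedding function with fixed weights.

    The defining trait of a Siamese setup is that the same weights
    embed both inputs of a pair. The weights here are random and
    untrained (an assumption for illustration only).
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(input_dim)]
               for _ in range(embed_dim)]

    def encode(x):
        # Linear projection followed by tanh, one output per weight row.
        return [math.tanh(sum(w * v for w, v in zip(row, x)))
                for row in weights]

    return encode


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def classify(encode, query, references):
    """Return the label of the reference whose embedding is closest to the query's."""
    q = encode(query)
    return min(references,
               key=lambda label: distance(q, encode(references[label])))


# Hypothetical stand-ins for flattened spectrogram features.
encode = make_encoder(input_dim=4, embed_dim=3)
references = {
    "type_a": [1.0, 0.0, 0.0, 0.0],
    "type_b": [0.0, 0.0, 0.0, 1.0],
}
```

Calling `classify(encode, query, references)` assigns a query the label of its nearest reference in embedding space. With learned weights, the same comparison step could serve both the population labels and the categorical call-count labels mentioned in the abstract, though how the authors actually set this up is not stated there.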

Updated: 2021-05-06
