On loss functions and CNNs for improved bioacoustic signal classification, Ecological Informatics

On loss functions and CNNs for improved bioacoustic signal classification
Ecological Informatics (IF 5.8), Pub Date: 2021-05-29, DOI: 10.1016/j.ecoinf.2021.101331
Jie Xie, Kai Hu, Ya Guo, Qibin Zhu, Jinghu Yu

Frog populations have been declining rapidly worldwide, a trend regarded as one of the most critical threats to global biodiversity. Large volumes of frog recordings have therefore been collected to assess this decline. Most previous studies focused on frog call classification based on syllable segmentation, which is sensitive to background noise. Our recent study used a 1D-CNN with cross-entropy loss for frog call classification. However, the 1D-CNN is also sensitive to background noise, and the imbalance in the number of samples of different frog species was not considered. This study therefore investigates loss functions and imbalance learning for bioacoustic signal classification in continuous recordings. Specifically, four loss functions are compared: cross-entropy loss, weighted cross-entropy loss, focal loss, and twin loss, each combined with three CNN architectures: 1D-CNN, 2D-CNN, and 1D-2D-CNN. In addition, random oversampling is used to improve classification performance on two imbalanced datasets. Experimental results show that (1) the 1D-2D-CNN model achieves the best performance for classifying both Australian and Brazilian frog calls; (2) focal loss is the best of the four loss functions for classifying low-SNR recordings; and (3) the highest macro F1-scores for classifying the Australian and Brazilian frog recordings are 89.26%±0.36% and 93.47%±0.28%, respectively.
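To make the compared ideas concrete, here is a minimal sketch of cross-entropy, weighted cross-entropy, focal loss, and random oversampling in plain Python. It is not the authors' implementation: the function names, the inverse-frequency weighting scheme, the default gamma of 2.0, and the per-example formulation are all assumptions chosen for illustration, and twin loss is omitted because the abstract does not define it.

```python
import math
import random
from collections import Counter


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target, class_weights=None):
    """Cross-entropy for one example.

    Passing class_weights (one weight per class) gives the weighted
    variant, which scales the loss by the weight of the true class.
    """
    p_t = softmax(logits)[target]
    w = 1.0 if class_weights is None else class_weights[target]
    return -w * math.log(p_t)


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Focal loss in its commonly used form: -alpha_t * (1 - p_t)^gamma * log(p_t).

    The (1 - p_t)^gamma factor down-weights examples the model already
    classifies confidently. gamma=2.0 is an assumed default, not a value
    taken from the paper.
    """
    p_t = softmax(logits)[target]
    a = 1.0 if alpha is None else alpha[target]
    return -a * (1.0 - p_t) ** gamma * math.log(p_t)


def inverse_frequency_weights(labels, num_classes):
    """Hypothetical weighting scheme: n / (num_classes * count_c).

    A perfectly balanced label set yields a weight of 1.0 for every class.
    """
    counts = Counter(labels)
    n = len(labels)
    return [n / (num_classes * counts[c]) if counts[c] else 0.0
            for c in range(num_classes)]


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each minority class until
    every class has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    target_size = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for y, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target_size - len(items))]
        for s in items + extra:
            out_samples.append(s)
            out_labels.append(y)
    return out_samples, out_labels
```

With gamma set to 0 and no alpha, the focal loss reduces to ordinary cross-entropy, which is a convenient sanity check. In practice, oversampling would be applied to the training split only, so that duplicated recordings do not leak into evaluation.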

Updated: 2021-06-01
