Bioacoustic signal classification in continuous recordings: Syllable-segmentation vs sliding-window
Expert Systems with Applications ( IF 8.5 ) Pub Date : 2020-03-16 , DOI: 10.1016/j.eswa.2020.113390
Jie Xie , Kai Hu , Mingying Zhu , Ya Guo

Frog populations are declining rapidly worldwide, which is regarded as one of the most critical threats to global biodiversity. Large volumes of frog recordings have therefore been collected to assess this decline, and building an automatic frog species classification system is becoming ever more important. The traditional system for classifying frog species consists of four steps: (1) bioacoustic signal preprocessing, (2) segmentation, (3) feature extraction, and (4) classification. Each step directly affects the next, so the final classification performance depends heavily on the first three steps. Segmentation performance, in turn, depends heavily on the background noise in the environmental recordings. In this study, we propose an end-to-end approach for acoustic classification of frog species in continuous recordings. First, a sliding window is used to segment the audio signal into frames. Then, a network combining a 1D convolutional neural network with long short-term memory (CNN-LSTM) learns a representation from the raw audio signal, with three convolutional layers and one LSTM layer capturing the signal's patterns. Experimental results on classifying 23 Australian frog species demonstrate the effectiveness of the proposed CNN-LSTM method. Compared with the syllable-segmentation based frog species classification system, the proposed CNN-LSTM approach is more robust under various noisy conditions.
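
The sliding-window step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the window length or the hop (step) size, so the values below are purely illustrative assumptions, and the function name `sliding_window_frames` is hypothetical rather than taken from the paper. The sketch only shows how a continuous signal is cut into fixed-length frames without any syllable detection; it does not reproduce the authors' CNN-LSTM network.

```python
from typing import List, Sequence


def sliding_window_frames(
    signal: Sequence[float], window_size: int, hop_size: int
) -> List[List[float]]:
    """Split a 1-D signal into fixed-length, possibly overlapping frames.

    Trailing samples that do not fill a whole window are dropped.
    window_size and hop_size are in samples (assumed values, not from the paper).
    """
    if window_size <= 0 or hop_size <= 0:
        raise ValueError("window_size and hop_size must be positive")
    frames = []
    # Each frame starts hop_size samples after the previous one.
    for start in range(0, len(signal) - window_size + 1, hop_size):
        frames.append(list(signal[start:start + window_size]))
    return frames


# Toy "recording" of 10 samples; window of 4 samples, hop of 2 (50% overlap).
recording = [float(i) for i in range(10)]
frames = sliding_window_frames(recording, window_size=4, hop_size=2)
print(len(frames))   # 4
print(frames[0])     # [0.0, 1.0, 2.0, 3.0]
print(frames[-1])    # [6.0, 7.0, 8.0, 9.0]
```

Because every frame has the same length regardless of where calls begin or end, the frames can be fed directly to a classifier that operates on raw audio, which is what removes the dependence on a noise-sensitive segmentation step.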




Updated: 2020-03-16