Improved capsule routing for weakly labeled sound event detection
EURASIP Journal on Audio, Speech, and Music Processing (IF 2.4), Pub Date: 2022-03-07, DOI: 10.1186/s13636-022-00239-6
Haitao Li 1 , Shuguo Yang 1 , Wenwu Wang 2

Polyphonic sound event detection aims to identify the types of sound events that occur in a given audio clip, together with their onset and offset times, where multiple sound events may occur simultaneously. Deep learning-based methods such as convolutional neural networks (CNNs) have achieved state-of-the-art results in polyphonic sound event detection. However, two open challenges remain: overlap between events and a tendency to overfit. To address both problems, we propose a capsule network-based method for polyphonic sound event detection. With so-called dynamic routing, capsule networks have the advantage of handling overlapping objects and a generalization ability that reduces overfitting. However, dynamic routing also greatly slows down training. To speed up training, we propose a weakly labeled polyphonic sound event detection model based on improved capsule routing. The proposed method is evaluated on Task 4 of the DCASE 2017 challenge and compared with several baselines, demonstrating competitive results in terms of F-score and computational efficiency.
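
For readers unfamiliar with dynamic routing, the sketch below illustrates why it is costly: the coupling between input and output capsules is recomputed in an inner loop on every forward pass. This is a minimal pure-Python sketch of the standard routing-by-agreement procedure (Sabour et al., 2017), not the improved routing proposed in this paper, which the abstract does not describe. The function names, the `u_hat[i][j]` layout, and the default of three iterations are assumptions made for illustration.

```python
import math

def squash(v):
    """Shrink vector v so its length lies in [0, 1) while keeping its direction."""
    sq = sum(x * x for x in v)
    if sq == 0.0:
        return [0.0 for _ in v]
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in v]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_routing(u_hat, num_iters=3):
    """Route predictions u_hat[i][j] (input capsule i -> output capsule j).

    Returns (v, c): the output capsule vectors and the coupling
    coefficients that were used to compute them.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    for _ in range(num_iters):
        # Each input capsule distributes its output over the output capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            # Weighted sum of the predictions for output capsule j.
            s = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                 for d in range(dim)]
            v.append(squash(s))
        # Raise the logit where a prediction agrees with the output (dot product).
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c

# Two input capsules agree on output 0 and disagree on output 1 (toy values).
u_hat = [
    [[1.0, 0.0], [0.1, 0.0]],
    [[1.0, 0.0], [-0.1, 0.0]],
]
v, c = dynamic_routing(u_hat)
```

In this toy input, both input capsules predict the same vector for output capsule 0, so routing shifts their coupling toward that capsule. The nested loops run once per routing iteration, which is the overhead that motivates faster routing schemes.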

Updated: 2022-03-07