Detecting Acoustic Events Using Convolutional Macaron Net

arXiv - CS - Sound, Pub Date: 2020-09-21, DOI: arxiv-2009.09632
Teck Kai Chan, Cheng Siong Chin

In this paper, we address the lack of strongly labeled data by using pseudo strongly labeled data approximated with Convolutive Nonnegative Matrix Factorization (CNMF). Using this pseudo strongly labeled data, we then train a new architecture that combines a Convolutional Neural Network (CNN) with Macaron Net (MN), which we term Convolutional Macaron Net (CMN). In contrast to the Mean-Teacher approach, which trains two similar models synchronously, we propose to train two different CMNs synchronously, where one model provides the frame-level prediction and the other provides the clip-level prediction. With the proposed framework, our system outperforms the baseline system of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2020 Challenge Task 4 by a margin of over 10%. Compared with the first-place system of the challenge, which uses a combination of CNN and Conformer, our system is also marginally better, by 0.3%.
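The abstract does not spell out the internals of the CMN, so the following is only a minimal sketch of the general Macaron-style layer pattern: self-attention placed between two feed-forward sublayers whose residual contributions are each scaled by one half. The dimensions, the single attention head, the pre-normalization, the random weights, and the names `macaron_block` and `init_params` are all assumptions made for illustration. The input array merely stands in for frame-wise features that a CNN front end might produce. It is not the authors' implementation.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    # Normalize each frame's feature vector to zero mean and unit variance.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def feed_forward(x, w1, b1, w2, b2):
    # Position-wise two-layer network with a ReLU in between.
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2


def self_attention(x, wq, wk, wv):
    # Single-head scaled dot-product attention over the time axis.
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


def init_params(d_model, d_ff, rng):
    # Hypothetical random initialization, for illustration only.
    def ffn():
        return (rng.normal(0, 0.1, (d_model, d_ff)), np.zeros(d_ff),
                rng.normal(0, 0.1, (d_ff, d_model)), np.zeros(d_model))

    attn = tuple(rng.normal(0, 0.1, (d_model, d_model)) for _ in range(3))
    return {"ffn1": ffn(), "attn": attn, "ffn2": ffn()}


def macaron_block(x, params):
    # Half-step feed-forward, then attention, then another half-step feed-forward.
    x = x + 0.5 * feed_forward(layer_norm(x), *params["ffn1"])
    x = x + self_attention(layer_norm(x), *params["attn"])
    x = x + 0.5 * feed_forward(layer_norm(x), *params["ffn2"])
    return x


rng = np.random.default_rng(0)
frames = rng.normal(size=(10, 16))  # assumed stand-in: 10 frames, 16 features each
params = init_params(d_model=16, d_ff=32, rng=rng)
out = macaron_block(frames, params)
print(out.shape)
```

The block keeps the input shape, so several such layers could be stacked before a frame-level or clip-level output layer. How the paper actually arranges those layers is not described in the abstract.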


Updated: 2020-09-22
