Weakly Supervised Source-Specific Sound Level Estimation in Noisy Soundscapes, arXiv - CS - Sound

arXiv - CS - Sound. Pub Date: 2021-05-06, DOI: arxiv-2105.02911
Aurora Cramer, Mark Cartwright, Fatemeh Pishdadian, Juan Pablo Bello

While the estimation of what sound sources are, when they occur, and where they originate has been well studied, the estimation of how loud these sources are has often been overlooked. Current solutions to this task, which we refer to as source-specific sound level estimation (SSSLE), are limited by the impracticality of acquiring realistic data and by a lack of robustness to realistic recording conditions. Recently proposed weakly supervised source separation methods offer a means of leveraging clip-level source annotations to train source separation models. We augment these models with modified loss functions to bridge the gap between source separation and SSSLE and to address the presence of background sound. We show that our approach improves SSSLE performance compared to baseline source separation models, and we provide an ablation analysis of our method's design choices, showing that SSSLE is possible in practical recording and annotation scenarios.
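To make the link between source separation and level estimation concrete, the following is a minimal sketch, not the authors' method. It assumes that a separation model has already produced one waveform estimate per source, and it uses the RMS level in dB relative to full scale as the level measure; the abstract does not specify the measure. The function `level_db`, the toy signals, and the names `source_a` and `source_b` are hypothetical, and the true sources stand in for the separation output.

```python
import numpy as np

def level_db(signal, eps=1e-12):
    """RMS level of a waveform in dB relative to full scale (an assumed measure)."""
    rms = np.sqrt(np.mean(np.square(signal)))
    return 20.0 * np.log10(rms + eps)  # eps avoids log10(0) on silence

# Toy one-second mixture: two hypothetical sources plus faint background noise.
sr = 16000
t = np.arange(sr) / sr
source_a = 0.5 * np.sin(2 * np.pi * 440 * t)
source_b = 0.05 * np.sin(2 * np.pi * 1000 * t)
rng = np.random.default_rng(0)
background = 0.001 * rng.standard_normal(sr)
mixture = source_a + source_b + background

# Stand-ins for the per-source outputs of a separation model.
# Here the true sources are used, so the levels are exact by construction.
estimates = {"source_a": source_a, "source_b": source_b}
levels = {name: level_db(est) for name, est in estimates.items()}

for name, db in levels.items():
    print(f"{name}: {db:.2f} dB")
```

In this toy case the two sources differ in amplitude by a factor of 10, so their levels differ by 20 dB. With a real separation model, any error in the estimated waveforms would carry over directly into the estimated levels.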

Updated: 2021-05-10
