What Makes Sound Event Localization and Detection Difficult? Insights from Error Analysis
arXiv - CS - Sound, Pub Date: 2021-07-22, DOI: arxiv-2107.10469
Thi Ngoc Tho Nguyen, Karn N. Watcharasupat, Zhen Jian Lee, Ngoc Khanh Nguyen, Douglas L. Jones, Woon Seng Gan

Sound event localization and detection (SELD) is an emerging research topic that aims to unify the tasks of sound event detection and direction-of-arrival estimation. As a result, SELD inherits the challenges of both tasks, such as noise, reverberation, interference, polyphony, and non-stationarity of sound sources. Furthermore, SELD often faces the additional challenge of assigning correct correspondences between the detected sound classes and directions of arrival when multiple sound events overlap. Previous studies have shown that unknown interferences in reverberant environments often cause major degradation in the performance of SELD systems. To further understand the challenges of the SELD task, we performed a detailed error analysis on two of our SELD systems, which both ranked second in the team category of the DCASE SELD Challenge, one in 2020 and one in 2021. Experimental results indicate that polyphony is the main challenge in SELD, owing to the difficulty of detecting all sound events of interest. In addition, the SELD systems tend to make fewer errors for the polyphonic scenario that is dominant in the training set.
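
As an illustration of the kind of breakdown such an error analysis might involve, the sketch below groups frame-level detection errors by polyphony level (the number of reference events active in a frame). It is not the authors' analysis code: the function name `errors_by_polyphony`, the set-per-frame input format, and the class labels are all assumptions made for this example, and direction-of-arrival errors are left out entirely.

```python
from collections import defaultdict


def errors_by_polyphony(reference, predicted):
    """Group frame-level detection errors by polyphony level.

    reference, predicted: lists of sets of class labels, one set per frame
    (a hypothetical format chosen for this sketch).
    Returns {polyphony_level: {"frames", "missed", "false_alarms"}}.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same number of frames")

    stats = defaultdict(lambda: {"frames": 0, "missed": 0, "false_alarms": 0})
    for ref, pred in zip(reference, predicted):
        level = len(ref)  # polyphony = number of reference events in this frame
        entry = stats[level]
        entry["frames"] += 1
        entry["missed"] += len(ref - pred)        # reference events not detected
        entry["false_alarms"] += len(pred - ref)  # detections with no reference event
    return dict(stats)


if __name__ == "__main__":
    # Toy data with made-up labels, only to show the call shape.
    reference = [set(), {"speech"}, {"speech", "dog"}, {"speech", "dog"}]
    predicted = [set(), {"speech"}, {"speech"}, {"dog", "alarm"}]
    for level, entry in sorted(errors_by_polyphony(reference, predicted).items()):
        print(level, entry)
```

Comparing the missed-event counts across polyphony levels is one simple way to check whether overlapping events are harder to detect than isolated ones in a given system's output.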

Updated: 2021-07-23