Overcoming label noise in audio event detection using sequential labeling
arXiv - CS - Sound Pub Date : 2020-07-10 , DOI: arxiv-2007.05191
Jae-Bin Kim, Seongkyu Mun, Myungwoo Oh, Soyeon Choe, Yong-Hyeok Lee, Hyung-Min Park

This paper addresses the noisy label issue in audio event detection (AED) by refining strong labels into sequential labels with the inaccurate timestamps removed. In AED, strong labels give the occurrence of a specific event together with the timestamps of its start and end in an audio clip. The timestamps depend on the subjectivity of each annotator, so label noise in them is inevitable. In contrast, weak labels indicate only the occurrence of a specific event. They are free of the label noise caused by timestamps, but they carry no time information. To fully exploit the information in the available strong and weak labels, we propose an AED scheme that converts the strong labels into sequential labels and trains with them in addition to the given strong and weak labels. Using sequential labels consistently improved performance, particularly the segment-based F-score, by focusing on occurrences of events. In the mean-teacher-based approach to semi-supervised learning, adding an early step with sequential prediction, alongside supervised learning with sequential labels, mitigated label noise and inaccurate predictions of the teacher model and significantly improved the segment-based F-score while maintaining the event-based F-score.
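The relationship between the three label types can be illustrated with a minimal sketch. The abstract does not specify the exact label format, so everything here is an assumption made for illustration: strong labels are taken to be `(event, onset_sec, offset_sec)` tuples, a sequential label is taken to be the time-ordered list of boundary tags with the timestamps dropped, and the `_start` / `_end` tag names and both function names are hypothetical.

```python
from typing import List, Tuple

# Assumed layout of one strong label: (event name, onset in seconds, offset in seconds).
StrongLabel = Tuple[str, float, float]


def strong_to_sequential(labels: List[StrongLabel]) -> List[str]:
    """Keep only the order of event boundaries and discard the timestamps.

    The "<event>_start" / "<event>_end" tag names are hypothetical.
    """
    boundaries = []
    for event, onset, offset in labels:
        boundaries.append((onset, f"{event}_start"))
        boundaries.append((offset, f"{event}_end"))
    # Sort by time only; Python's sort is stable, so boundaries that share
    # a timestamp keep their insertion order (start before end per event).
    boundaries.sort(key=lambda b: b[0])
    return [tag for _, tag in boundaries]


def strong_to_weak(labels: List[StrongLabel]) -> List[str]:
    """Keep only which events occur in the clip, with no order or time."""
    return sorted({event for event, _, _ in labels})


clip = [("speech", 0.5, 3.2), ("dog_bark", 1.0, 1.8)]
sequential = strong_to_sequential(clip)
weak = strong_to_weak(clip)
print(sequential)  # ['speech_start', 'dog_bark_start', 'dog_bark_end', 'speech_end']
print(weak)        # ['dog_bark', 'speech']
```

In this sketch, shifting an onset or offset by a small amount leaves the sequential label unchanged as long as the boundary order is preserved, which is the sense in which timestamp noise is removed while the ordering information that weak labels lack is kept.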

Updated: 2020-07-13