Leveraging Hierarchical Structures for Few-Shot Musical Instrument Recognition
arXiv - CS - Sound. Pub Date: 2021-07-14, DOI: arxiv-2107.07029
Hugo Flores Garcia, Aldo Aguilar, Ethan Manilow, Bryan Pardo

Deep learning work on musical instrument recognition has generally focused on instrument classes for which we have abundant data. In this work, we exploit hierarchical relationships between instruments in a few-shot learning setup to enable classification of a wider set of musical instruments, given a few examples at inference. We apply a hierarchical loss function to the training of prototypical networks, combined with a method to aggregate prototypes hierarchically, mirroring the structure of a predefined musical instrument hierarchy. These extensions require no changes to the network architecture, and levels can be easily added or removed. Compared to a non-hierarchical few-shot baseline, our method leads to a significant increase in classification accuracy and a significant decrease in mistake severity on instrument classes unseen in training.
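
The abstract does not spell out the loss or the aggregation rule, so the following is only a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes a two-level hierarchy, parent prototypes taken as the mean of their children's prototypes, and a per-level cross-entropy summed with a hypothetical weight `alpha`; all function names and the toy data are illustrative.

```python
import numpy as np

def class_prototypes(embeddings, labels, n_classes):
    """Mean support embedding per class (the standard prototypical-network step)."""
    labels = np.asarray(labels)
    return np.stack([embeddings[labels == c].mean(axis=0) for c in range(n_classes)])

def aggregate_prototypes(child_protos, parent_of, n_parents):
    """Assumed aggregation rule: a parent prototype is the mean of its children's prototypes."""
    parent_of = np.asarray(parent_of)
    return np.stack([child_protos[parent_of == p].mean(axis=0) for p in range(n_parents)])

def log_probs(queries, protos):
    """Log-softmax over negative squared Euclidean distances to each prototype."""
    dists = ((queries[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    logits = -dists
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    return logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

def hierarchical_loss(queries, q_labels, support, s_labels, parent_of, alpha=0.5):
    """Leaf-level cross-entropy plus alpha times parent-level cross-entropy (assumed weighting)."""
    parent_of = np.asarray(parent_of)
    q_labels = np.asarray(q_labels)
    leaf = class_prototypes(support, s_labels, len(parent_of))
    parent = aggregate_prototypes(leaf, parent_of, int(parent_of.max()) + 1)
    rows = np.arange(len(queries))
    leaf_ce = -log_probs(queries, leaf)[rows, q_labels].mean()
    parent_ce = -log_probs(queries, parent)[rows, parent_of[q_labels]].mean()
    return leaf_ce + alpha * parent_ce

# Toy episode: 4 leaf classes (2 support embeddings each) grouped under 2 parent classes.
support = np.array([[0, 0], [0, 2], [2, 0], [2, 2],
                    [10, 10], [10, 12], [12, 10], [12, 12]], dtype=float)
s_labels = np.array([0, 0, 1, 1, 2, 2, 3, 3])
parent_of = np.array([0, 0, 1, 1])  # leaf class index -> parent class index
queries = np.array([[0.0, 1.0], [11.0, 11.0]])
q_labels = np.array([0, 2])

leaf_protos = class_prototypes(support, s_labels, 4)
parent_protos = aggregate_prototypes(leaf_protos, parent_of, 2)
loss = hierarchical_loss(queries, q_labels, support, s_labels, parent_of)
```

Because the hierarchy enters only through `parent_of` and the loss, nothing about the embedding network changes in this sketch, which is in keeping with the abstract's remark that no architectural changes are required.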

Updated: 2021-07-16