

PureMIC: A New Audio Dataset for the Classification of Musical Instruments based on Convolutional Neural Networks
Journal of Signal Processing Systems (IF 1.8), Pub Date: 2021-04-01, DOI: 10.1007/s11265-021-01661-3
Gonçalo Castel-Branco, Gabriel Falcao, Fernando Perdigão

Automatic classification of musical instruments from audio relies heavily on datasets of acoustic recordings that are used to train models of those instruments. Training requires precise labels of the instrument events, yet such labels are very difficult to obtain, especially for polyphonic performances. OpenMIC-2018 is a polyphonic dataset created specifically to train instrument models, but it is based on weak and incomplete labels. Automatic classification of sound events based on the VGGish bottleneck layer, as previously proposed by AudioSet, classifies only one second at a time, which makes it hard to find the label of an exact moment. To address this problem, this paper proposes PureMIC, a new strongly labeled dataset (SLD) of 1000 isolated, manually labeled single-instrument clips. Moreover, the proposed model classifies clips over time, and this ability also makes the labeling of the large number of unlabeled samples in OpenMIC-2018 more robust. The paper disambiguates and reports the automatic labeling of previously unlabeled samples. The proposed new labels achieve a mean average precision (mAP) of 0.701 on the OpenMIC test data, outperforming the baseline (0.66). The code is released online so that the research community can replicate and follow the proposed implementation.
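
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a mean average precision can be computed for a multi-instrument task with incomplete labels. It is not the authors' evaluation code. The function names, the use of `None` to mark an unlabeled (clip, instrument) pair, and the toy numbers are all assumptions made for illustration.

```python
def average_precision(y_true, y_score):
    """Average precision for one class: the mean of the precision values
    measured at the rank of each positive example."""
    # Rank examples from highest to lowest score.
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    hits = 0
    precisions = []
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions)


def mean_average_precision(labels, scores):
    """labels[c][n] is 1, 0, or None (assumed marker for 'unlabeled') for
    class c and clip n; scores[c][n] is the model score for that pair."""
    aps = []
    for y_true, y_score in zip(labels, scores):
        # Keep only the clips that actually carry a label for this class.
        pairs = [(t, s) for t, s in zip(y_true, y_score) if t is not None]
        if not any(t == 1 for t, _ in pairs):
            continue  # AP is undefined for a class with no positives
        aps.append(average_precision([t for t, _ in pairs],
                                     [s for _, s in pairs]))
    return sum(aps) / len(aps)


# Toy example: two instrument classes, four clips, some pairs unlabeled.
labels = [[1, 0, 1, None], [0, 1, None, 0]]
scores = [[0.9, 0.8, 0.7, 0.1], [0.2, 0.6, 0.5, 0.4]]
print(round(mean_average_precision(labels, scores), 4))
```

In this sketch, unlabeled pairs are simply excluded from each class's ranking, which is one common way to evaluate on partially labeled data. Whether the paper uses exactly this convention is not stated in the abstract.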


Updated: 2021-04-02
