Pitch-Informed Instrument Assignment Using a Deep Convolutional Network with Multiple Kernel Shapes
arXiv - CS - Neural and Evolutionary Computing. Pub Date: 2021-07-28, DOI: arxiv-2107.13617
Carlos Lordelo, Emmanouil Benetos, Simon Dixon, Sven Ahlbäck

This paper proposes a deep convolutional neural network for note-level instrument assignment. Given a polyphonic multi-instrumental music signal along with its ground-truth or predicted notes, the objective is to assign an instrumental source to each note. The problem is addressed as a pitch-informed classification task in which each note is analysed individually. We also propose using several kernel shapes in the convolutional layers to facilitate the learning of efficient timbre-discriminative feature maps. Experiments on the MusicNet dataset using 7 instrument classes show that our approach achieves an average F-score of 0.904 when the original multi-pitch annotations are used as the pitch information for the system, and that it also excels when the note information is provided by third-party multi-pitch estimation algorithms. We also include ablation studies that investigate the effect of using multiple kernel shapes and compare different input representations for the audio and the note-related information.
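
The idea of using several kernel shapes in one convolutional layer can be illustrated with a minimal pure-Python sketch. It is not the authors' architecture: the kernel shapes in `KERNEL_SHAPES`, the averaging weights, the toy 8 x 6 input patch, and the helper names `conv2d_same`, `multi_shape_block` and `averaging_kernel` are all assumptions made for illustration. In a trained network the kernel weights would be learned rather than fixed.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_same(x: Matrix, kernel: Matrix) -> Matrix:
    """Cross-correlate x with kernel, zero-padded so the output keeps x's shape.

    Kernel dimensions are assumed to be odd so the kernel has a centre cell.
    """
    rows, cols = len(x), len(x[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    pad_r, pad_c = k_rows // 2, k_cols // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for a in range(k_rows):
                for b in range(k_cols):
                    r, c = i + a - pad_r, j + b - pad_c
                    if 0 <= r < rows and 0 <= c < cols:  # zero padding outside
                        acc += x[r][c] * kernel[a][b]
            out[i][j] = acc
    return out


def multi_shape_block(x: Matrix, kernels: List[Matrix]) -> List[Matrix]:
    """Apply every kernel to the same input, apply ReLU, and stack the maps."""
    return [
        [[max(0.0, v) for v in row] for row in conv2d_same(x, k)]
        for k in kernels
    ]


def averaging_kernel(k_rows: int, k_cols: int) -> Matrix:
    """Placeholder weights; a real layer would learn these values."""
    return [[1.0 / (k_rows * k_cols)] * k_cols for _ in range(k_rows)]


# Hypothetical (frequency, time) shapes: square, tall in frequency, wide in time.
KERNEL_SHAPES = [(3, 3), (5, 1), (1, 5)]

# Toy time-frequency patch: 8 frequency bins x 6 time frames, one active cell.
patch = [[0.0] * 6 for _ in range(8)]
patch[4][2] = 1.0

features = multi_shape_block(
    patch, [averaging_kernel(r, c) for r, c in KERNEL_SHAPES]
)
print(len(features), len(features[0]), len(features[0][0]))
```

Because every kernel uses zero padding that preserves the input shape, the resulting feature maps can be stacked as channels. The tall kernel spreads the active cell only along frequency and the wide kernel only along time, which is the kind of complementary view that differently shaped kernels are meant to provide.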

Updated: 2021-07-30