Online/offline score informed music signal decomposition: application to minus one
EURASIP Journal on Audio, Speech, and Music Processing (IF 1.7), Pub Date: 2019-12-01, DOI: 10.1186/s13636-019-0168-6
Antonio Jesús Munoz-Montoro, Julio José Carabias-Orti, Pedro Vera-Candeas, Francisco Jesús Canadas-Quesada, Nicolás Ruiz-Reyes

In this paper, we propose a score-informed source separation framework based on non-negative matrix factorization (NMF) and dynamic time warping (DTW) that is suitable for both offline and online systems. The proposed framework is composed of three stages: training, alignment, and separation. In the training stage, the score is encoded as a sequence of individual occurrences and unique combinations of notes, denoted as score units. We then propose an NMF-based signal model in which the basis function for each score unit is represented as a weighted combination of spectral patterns for each note and instrument in the score, obtained from an overcomplete dictionary trained a priori. In the alignment stage, the time-varying gains are estimated at frame level by computing the projection of each score unit basis function over the captured audio signal. Then, under the assumption that only one score unit is active at a time, we propose an online DTW scheme to synchronize the score information with the performance. Finally, in the separation stage, the obtained gains are refined using local low-rank NMF, and the separated sources are obtained using a soft-filter strategy. The framework has been evaluated and compared with other state-of-the-art methods for single-channel source separation of small ensembles and large orchestral ensembles, obtaining reliable results in terms of SDR and SIR. In addition, our method has been evaluated on the specific task of acoustic minus one, and some demos are presented.
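
To illustrate the alignment idea described in the abstract, the following is a minimal sketch of classic offline DTW over a cost matrix indexed by score unit and audio frame. It is not the online DTW scheme proposed in the paper. The function name `dtw_path` and the toy cost values are assumptions made for illustration only; in practice the cost would be derived from how well each score unit's basis function explains each frame.

```python
def dtw_path(cost):
    """Classic offline DTW (illustrative, not the paper's online variant).

    cost[i][j] is a hypothetical mismatch between score unit i and
    audio frame j. Returns the minimum-cost monotonic path as a list
    of (unit, frame) pairs from (0, 0) to the last unit and frame.
    """
    n_units, n_frames = len(cost), len(cost[0])
    inf = float("inf")

    # Accumulated cost: acc[i][j] is the cheapest way to reach (i, j).
    acc = [[inf] * n_frames for _ in range(n_units)]
    acc[0][0] = cost[0][0]
    for i in range(n_units):
        for j in range(n_frames):
            if i == 0 and j == 0:
                continue
            best = inf
            if i > 0:
                best = min(best, acc[i - 1][j])      # advance unit only
            if j > 0:
                best = min(best, acc[i][j - 1])      # advance frame only
            if i > 0 and j > 0:
                best = min(best, acc[i - 1][j - 1])  # advance both
            acc[i][j] = cost[i][j] + best

    # Backtrack from the end, always moving to the cheapest predecessor.
    i, j = n_units - 1, n_frames - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1][j - 1], i - 1, j - 1))
        if j > 0:
            candidates.append((acc[i][j - 1], i, j - 1))
        if i > 0:
            candidates.append((acc[i - 1][j], i - 1, j))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return path


# Toy example: 3 score units, 6 frames; cost is 0 where a unit matches.
toy_cost = [
    [0, 0, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 0],
]
alignment = dtw_path(toy_cost)
```

Each pair in `alignment` assigns one frame to one score unit, which mirrors the abstract's assumption that only one score unit is active at a time. An online variant would have to make these decisions incrementally, without access to future frames.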

Updated: 2019-12-01