Multi-channel U-Net for Music Source Separation
arXiv - CS - Information Retrieval Pub Date: 2020-03-23, DOI: arxiv-2003.10414
Venkatesh S. Kadandale, Juan F. Montesinos, Gloria Haro, Emilia Gómez

A fairly straightforward approach for music source separation is to train independent models, wherein each model is dedicated to estimating only a specific source. Training a single model to estimate multiple sources generally does not perform as well as the independent dedicated models. However, Conditioned U-Net (C-U-Net) uses a control mechanism to train a single model for multi-source separation and attempts to achieve a performance comparable to that of the dedicated models. We propose a multi-channel U-Net (M-U-Net) trained using a weighted multi-task loss as an alternative to the C-U-Net. We investigate two weighting strategies for our multi-task loss: 1) Dynamic Weighted Average (DWA), and 2) Energy Based Weighting (EBW). DWA determines the weights by tracking the rate of change of loss of each task during training. EBW aims to neutralize the training bias arising from the difference in energy levels of the sources in a mixture. Our methods provide three-fold advantages compared to C-U-Net: 1) fewer effective training iterations per epoch, 2) fewer trainable network parameters (no control parameters), and 3) faster processing at inference. Our methods achieve performance comparable to that of C-U-Net and the dedicated U-Nets at a much lower training cost.
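The abstract does not give the exact formulas, so the Python sketch below only illustrates the two weighting ideas as described above. The `dwa_weights` helper follows the commonly used DWA formulation (a softmax over per-task loss ratios), and `ebw_weights` is one plausible inverse-energy realization of EBW; the function names and the temperature parameter are hypothetical and not taken from the paper.

```python
import math

def dwa_weights(loss_history, num_tasks, temperature=2.0):
    """Dynamic Weighted Average (DWA) sketch: weight each task by the
    relative rate of change of its loss over the last two epochs.
    `loss_history` is a list of per-epoch lists of per-task losses.
    Hypothetical helper; the paper's exact formulation may differ."""
    if len(loss_history) < 2:
        # Not enough history yet: fall back to uniform weights.
        return [1.0] * num_tasks
    prev, prev2 = loss_history[-1], loss_history[-2]
    # Rate of change of each task's loss (a decreasing loss gives a ratio < 1).
    ratios = [prev[k] / prev2[k] for k in range(num_tasks)]
    exps = [math.exp(r / temperature) for r in ratios]
    z = sum(exps)
    # Softmax over the ratios, scaled so the weights sum to num_tasks.
    return [num_tasks * e / z for e in exps]

def ebw_weights(source_energies, eps=1e-8):
    """Energy Based Weighting (EBW) sketch: down-weight high-energy sources
    so that quieter sources are not under-trained. This inverse-energy
    normalization is one plausible realization, not necessarily the
    paper's exact scheme."""
    inv = [1.0 / (e + eps) for e in source_energies]
    z = sum(inv)
    return [len(inv) * w / z for w in inv]

def weighted_multitask_loss(task_losses, weights):
    """Combine per-source losses (e.g. vocals, drums, bass, other)
    into a single scalar for training a multi-channel U-Net."""
    return sum(w * l for w, l in zip(weights, task_losses))
```

In either scheme, the combined loss replaces the per-source losses used when training separate dedicated models, which is what lets a single M-U-Net estimate all sources at once.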

Updated: 2020-09-07