Toward Speech Separation in The Pre-Cocktail Party Problem with TasTas
arXiv - CS - Sound, Pub Date: 2020-09-07, DOI: arxiv-2009.03692
Ziqiang Shi and Jiqing Han

In this note, we propose using TasTas \cite{shi2020speech} as an end-to-end approach to monaural speech separation in the pre-cocktail party problem. Our experiments on the public WSJ0-5mix corpus yield a 10.41 dB SDR improvement. When online voice data remixing augmentation \cite{zeghidour2020wavesplit} is adopted in training, an 11.14 dB SDR improvement is achieved. We have open-sourced our re-implementation of DPRNN-TasNet at https://github.com/ShiZiqiang/dual-path-RNNs-DPRNNs-based-speech-separation. Our TasTas is built on this implementation of DPRNN-TasNet, so we believe the results in this paper can be reproduced with ease.

Updated: 2020-10-29