Toward Speech Separation in The Pre-Cocktail Party Problem with TasTas

arXiv - CS - Sound, Pub Date: 2020-09-07, DOI: arxiv-2009.03692
Ziqiang Shi and Jiqing Han

In this note, we propose using TasTas \cite{shi2020speech} as an end-to-end approach to monaural speech separation in the pre-cocktail party problem. Our experiments on the public WSJ0-5mix corpus yield a 10.41 dB SDR improvement. If online voice data remixing augmentation \cite{zeghidour2020wavesplit} is adopted in training, an 11.14 dB SDR improvement can be achieved. We have open-sourced our re-implementation of DPRNN-TasNet at https://github.com/ShiZiqiang/dual-path-RNNs-DPRNNs-based-speech-separation. Our TasTas is built on this implementation of DPRNN-TasNet, so we believe the results in this paper can be reproduced with ease.
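
For readers unfamiliar with the two quantities mentioned in the abstract, the following is a minimal NumPy sketch of what "SDR improvement" and "online remixing" can look like. It is an illustration under stated assumptions, not the authors' code: the SDR here is a simplified energy ratio (reference energy over error energy) rather than a full BSS-eval computation, the remixing simply reshuffles each speaker slot across a batch and re-sums, and the function names `sdr_db`, `sdr_improvement_db`, and `remix_batch` are hypothetical.

```python
import numpy as np


def sdr_db(reference, estimate, eps=1e-12):
    """Simplified SDR in dB: reference energy over error energy.

    Assumption: this is a plain energy ratio, not the BSS-eval SDR that
    separation papers usually report.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    error = reference - estimate
    ratio = (np.sum(reference ** 2) + eps) / (np.sum(error ** 2) + eps)
    return 10.0 * np.log10(ratio)


def sdr_improvement_db(reference, estimate, mixture):
    """SDR of the separated estimate minus SDR of the unprocessed mixture."""
    return sdr_db(reference, estimate) - sdr_db(reference, mixture)


def remix_batch(sources, rng):
    """Build new training mixtures by reshuffling sources within a batch.

    sources: array of shape (batch, n_speakers, n_samples).
    Each speaker slot is permuted independently across the batch, and the
    reshuffled sources are summed into new mixtures.
    Returns (shuffled_sources, mixtures).
    """
    batch, n_speakers, _ = sources.shape
    shuffled = np.stack(
        [sources[rng.permutation(batch), k] for k in range(n_speakers)],
        axis=1,
    )
    mixtures = shuffled.sum(axis=1)
    return shuffled, mixtures


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s1 = rng.standard_normal(8000)
    s2 = rng.standard_normal(8000)
    mixture = s1 + s2
    estimate = s1 + 0.1 * s2  # pretend the interferer was attenuated 10x
    print(f"SDR improvement: {sdr_improvement_db(s1, estimate, mixture):.2f} dB")

    # Five speakers per mixture, mirroring the WSJ0-5mix setting.
    batch_sources = rng.standard_normal((4, 5, 8000))
    new_sources, new_mixtures = remix_batch(batch_sources, rng)
    print(new_sources.shape, new_mixtures.shape)
```

Because the remixing happens on every batch, the model sees fresh speaker combinations at each step instead of a fixed set of pre-mixed utterances, which is the general idea behind this kind of augmentation.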

Updated: 2020-10-29
