Generative adversarial networks for single channel separation of convolutive mixed speech signals
Neurocomputing ( IF 6 ) Pub Date : 2021-01-18 , DOI: 10.1016/j.neucom.2021.01.052
Yang Li , Wei-Tao Zhang , Shun-Tian Lou

Suppressing interference is important for speech recognition in noisy conditions, and it is considerably harder when only a single receiving channel is available. In this paper, we propose a generative adversarial network (GAN) based method for single-channel dereverberation and speech separation. Unlike existing methods, our method accounts for the influence of strong reverberation on the observed signals. The proposed network has two parts: reverberation suppression and target speech enhancement. First, we use an improved CycleGAN to compensate for the multipath effect on both the target speech and the interference. Second, we propose a differentialGAN that extracts both the target speech and the interference, so that the interference enhancement network indirectly improves the performance of the target speech enhancement network. We use the real and imaginary parts of the complex spectrum as the feature vector, which avoids phase mismatch during signal recovery. Simulation results show that our method outperforms competing methods on multiple metrics in severely reverberant environments.
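The feature choice described in the abstract (real and imaginary parts of the complex spectrum) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `stft`, `to_features`, and `from_features`, the Hann window, the frame length of 256, and the hop of 128 are all assumptions made for illustration. It shows only that a two-channel real/imaginary feature carries the phase with it, whereas a magnitude feature paired with the phase of a different signal (here, a hypothetical mixture) does not reproduce the original spectrum.

```python
import numpy as np

def stft(signal, n_fft=256, hop=128):
    """Hann-windowed short-time Fourier transform.

    Returns a complex array of shape (frames, n_fft // 2 + 1).
    The window and frame sizes are illustrative assumptions.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)

def to_features(spec):
    """Stack real and imaginary parts as two channels: (2, frames, bins)."""
    return np.stack([spec.real, spec.imag])

def from_features(feat):
    """Rebuild the complex spectrum from the two-channel feature."""
    return feat[0] + 1j * feat[1]

# Synthetic stand-ins for a target signal and an interfering signal.
rng = np.random.default_rng(0)
target = rng.standard_normal(4096)
interference = rng.standard_normal(4096)
mixture = target + interference

spec = stft(target)
feat = to_features(spec)

# Real/imaginary features: the complex spectrum is recovered exactly.
recovered = from_features(feat)

# Magnitude-only features: phase must be borrowed, here from the mixture,
# so the rebuilt spectrum no longer matches the target spectrum.
borrowed_phase = np.angle(stft(mixture))
mismatched = np.abs(spec) * np.exp(1j * borrowed_phase)

print(feat.shape)
print(np.allclose(recovered, spec))
print(np.allclose(mismatched, spec))
```

In this sketch the round trip through `to_features` and `from_features` is lossless by construction, which is the property the abstract relies on; how the networks in the paper actually consume these features is not specified here.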




Updated: 2021-02-10