A time–frequency smoothing neural network for speech enhancement
Speech Communication (IF 2.4), Pub Date: 2020-09-21, DOI: 10.1016/j.specom.2020.09.002
Wenhao Yuan

In existing speech enhancement methods based on deep neural networks (DNNs), the network architectures are not designed specifically for speech enhancement, and they extract local features of noisy speech in a non-causal way. Inspired by the feature calculation method based on time–frequency correlation in improved minima controlled recursive averaging (IMCRA), this paper proposes a time–frequency smoothing neural network for speech enhancement, which uses long short-term memory (LSTM) and a convolutional neural network (CNN) to model the correlation in the time and frequency dimensions, respectively. To verify the effectiveness of the proposed network, various causal speech enhancement systems are built on different networks, and extensive experiments are carried out in terms of speech quality and intelligibility. The experimental results show that the proposed network yields better speech enhancement performance than the other networks.
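The time and frequency smoothing that the abstract cites as inspiration can be illustrated with a small sketch. The following is not the proposed network. It is a minimal pure-Python version of classical recursive averaging, in which a weighted window across neighbouring frequency bins stands in for the role the abstract assigns to the CNN, and a first-order recursion across frames stands in for the role assigned to the LSTM. The function names, the 3-tap window, the smoothing factor `alpha = 0.9`, and the renormalization at the spectrum edges are all illustrative assumptions, not values taken from the paper.

```python
from typing import List, Sequence


def smooth_frequency(power_frame: Sequence[float], window: Sequence[float]) -> List[float]:
    """Weighted average over neighbouring frequency bins (a 1-D convolution)."""
    half = len(window) // 2
    n_bins = len(power_frame)
    out = []
    for k in range(n_bins):
        acc = 0.0
        norm = 0.0
        for i, w in enumerate(window):
            j = k + i - half
            if 0 <= j < n_bins:  # skip taps that fall outside the spectrum
                acc += w * power_frame[j]
                norm += w
        # Assumption: renormalize by the taps actually used at the edges.
        out.append(acc / norm)
    return out


def smooth_time_frequency(
    power_frames: Sequence[Sequence[float]],
    window: Sequence[float] = (0.25, 0.5, 0.25),  # assumed 3-tap window
    alpha: float = 0.9,                           # assumed smoothing factor
) -> List[List[float]]:
    """Causal smoothing: each output frame uses only current and past frames."""
    smoothed = []
    prev = None
    for frame in power_frames:
        s_f = smooth_frequency(frame, window)
        if prev is None:
            cur = s_f  # first frame: no history to average with
        else:
            # First-order recursive average along the time axis.
            cur = [alpha * p + (1.0 - alpha) * f for p, f in zip(prev, s_f)]
        smoothed.append(cur)
        prev = cur
    return smoothed


if __name__ == "__main__":
    frames = [[0.0, 4.0, 0.0], [0.0, 0.0, 0.0]]
    for row in smooth_time_frequency(frames):
        print(row)
```

Because each output frame depends only on the current and earlier frames, the sketch is causal, which matches the causal systems the abstract evaluates.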




Updated: 2020-09-23