Speech enhancement using progressive learning-based convolutional recurrent neural network
Applied Acoustics (IF 3.4) Pub Date: 2020-09-01, DOI: 10.1016/j.apacoust.2020.107347
Andong Li, Minmin Yuan, Chengshi Zheng, Xiaodong Li

Abstract Recently, progressive learning has shown its capacity to improve speech quality and speech intelligibility when it is combined with deep neural network (DNN) and long short-term memory (LSTM) based monaural speech enhancement algorithms, especially in low signal-to-noise ratio (SNR) conditions. Nevertheless, due to the large number of parameters and the high computational complexity, such models are hard to implement on current resource-limited micro-controllers; thus, it is essential to significantly reduce both the number of parameters and the computational load for practical applications. For this purpose, we propose a novel progressive learning framework with causal convolutional recurrent neural networks, called PL-CRNN, which takes advantage of both convolutional neural networks and recurrent neural networks to drastically reduce the number of parameters while simultaneously improving speech quality and speech intelligibility. Numerous experiments verify the effectiveness of the proposed PL-CRNN model and indicate that it yields consistently better performance than the PL-DNN and PL-LSTM algorithms and achieves results close to, or even better than, those of the CRNN in terms of objective measurements. Compared with PL-DNN, PL-LSTM, and CRNN, the proposed PL-CRNN algorithm reduces the number of parameters by up to 93%, 97%, and 92%, respectively.
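The abstract describes the PL-CRNN only at a high level. As a rough illustration of the general idea (a causal convolutional encoder feeding a recurrent bottleneck, with one output head per progressive SNR stage), the PyTorch sketch below may help. It is not the authors' implementation: the layer sizes, kernel shapes, number of stages, and the choice of GRU cells are all illustrative assumptions.

```python
# Minimal sketch of a causal convolutional recurrent network with
# progressive output stages (assumed topology, not the paper's exact model).
import torch
import torch.nn as nn


class CausalConvBlock(nn.Module):
    """2-D convolution over (time, frequency), made causal in time by
    trimming the trailing padded frame."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=(2, 3),
                              stride=(1, 2), padding=(1, 1))
        self.norm = nn.BatchNorm2d(out_ch)
        self.act = nn.ELU()

    def forward(self, x):                # x: (batch, ch, time, freq)
        y = self.conv(x)[:, :, :-1, :]   # drop last time frame -> no future context
        return self.act(self.norm(y))


class PLCRNNSketch(nn.Module):
    """Convolutional encoder + GRU bottleneck + one masking head per
    progressive stage; each stage targets a cleaner intermediate signal."""

    def __init__(self, n_freq=161, n_stages=3, rnn_units=128):
        super().__init__()
        self.encoder = nn.Sequential(
            CausalConvBlock(1, 16),
            CausalConvBlock(16, 32),
            CausalConvBlock(32, 64),
        )
        f = n_freq
        for _ in range(3):               # frequency bins left after each stride-2 conv
            f = (f - 1) // 2 + 1
        self.rnn = nn.GRU(64 * f, rnn_units, num_layers=2, batch_first=True)
        self.heads = nn.ModuleList(
            [nn.Linear(rnn_units, n_freq) for _ in range(n_stages)])

    def forward(self, spec):             # spec: (batch, time, freq) noisy magnitudes
        z = self.encoder(spec.unsqueeze(1))           # (batch, 64, time, f)
        b, c, t, f = z.shape
        z = z.permute(0, 2, 1, 3).reshape(b, t, c * f)
        h, _ = self.rnn(z)                            # (batch, time, rnn_units)
        # One progressively "cleaner" masked estimate per stage.
        return [torch.sigmoid(head(h)) * spec for head in self.heads]


if __name__ == "__main__":
    net = PLCRNNSketch()
    noisy = torch.rand(2, 100, 161)      # 2 utterances, 100 frames, 161 frequency bins
    stages = net(noisy)
    print(len(stages), stages[0].shape)  # 3 stages, each (2, 100, 161)
    print(sum(p.numel() for p in net.parameters()), "parameters")
```

In a progressive-learning setup, each stage's output would be supervised with an intermediate target whose SNR is boosted relative to the previous stage; the sketch only shows the forward topology, not the stage-wise training targets.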

Updated: 2020-09-01