Deep Learning Based Real-Time Speech Enhancement for Dual-Microphone Mobile Phones, IEEE/ACM Transactions on Audio, Speech, and Language Processing


IEEE/ACM Transactions on Audio, Speech, and Language Processing (IF 4.1), Pub Date: 2021-05-20, DOI: 10.1109/taslp.2021.3082318
Ke Tan, Xueliang Zhang, DeLiang Wang


In mobile speech communication, speech signals can be severely corrupted by background noise when the far-end talker is in a noisy acoustic environment. To suppress background noise, speech enhancement systems are typically integrated into mobile phones, in which one or more microphones are deployed. In this study, we propose a novel deep learning based approach to real-time speech enhancement for dual-microphone mobile phones. The proposed approach employs a new densely-connected convolutional recurrent network to perform dual-channel complex spectral mapping. We utilize a structured pruning technique to compress the model without significantly degrading the enhancement performance, which yields a low-latency and memory-efficient enhancement system for real-time processing. Experimental results suggest that the proposed approach consistently outperforms an earlier approach to dual-channel speech enhancement for mobile phone communication, as well as a deep learning based beamformer.
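As a rough illustration of what "dual-channel complex spectral mapping" involves at the data level, the sketch below stacks the real and imaginary parts of one STFT frame from each microphone into four real-valued input channels, and recombines a two-channel (real, imaginary) output into a complex spectrum. The abstract does not spell out this layout, so the function names, the channel ordering, and the `passthrough_model` placeholder are all assumptions made for illustration; the placeholder stands in for the trained network and performs no enhancement.

```python
from typing import Callable, List, Sequence

Frame = List[complex]          # one STFT frame: one complex value per frequency bin
Features = List[List[float]]   # channels x frequency bins, all real-valued


def stack_dual_channel(primary: Sequence[complex],
                       secondary: Sequence[complex]) -> Features:
    """Stack real and imaginary parts of both microphones into 4 real channels.

    Assumed channel order: primary real, primary imag, secondary real,
    secondary imag. The ordering is hypothetical, not taken from the paper.
    """
    if len(primary) != len(secondary):
        raise ValueError("both microphones must have the same number of bins")
    return [
        [z.real for z in primary],
        [z.imag for z in primary],
        [z.real for z in secondary],
        [z.imag for z in secondary],
    ]


def to_complex(output: Features) -> Frame:
    """Recombine a 2-channel (real, imaginary) output into a complex spectrum."""
    real, imag = output
    return [complex(r, i) for r, i in zip(real, imag)]


def enhance_frame(primary: Sequence[complex],
                  secondary: Sequence[complex],
                  model: Callable[[Features], Features]) -> Frame:
    """Run one frame through `model`, which maps 4 input channels to 2."""
    return to_complex(model(stack_dual_channel(primary, secondary)))


def passthrough_model(features: Features) -> Features:
    """Hypothetical stand-in for the trained network.

    It simply returns the primary microphone's real and imaginary channels.
    """
    return [features[0], features[1]]


if __name__ == "__main__":
    mic1 = [1 + 2j, 0.5 - 1j, -3 + 0j]
    mic2 = [0.9 + 1.8j, 0.4 - 1.1j, -2.5 + 0.2j]
    print(enhance_frame(mic1, mic2, passthrough_model))
```

Predicting real and imaginary parts directly, rather than only a magnitude, lets a model of this kind estimate phase as well as magnitude. In a real system the enhanced spectrum would then be passed through an inverse STFT to obtain the time-domain signal.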


Updated: 2021-05-20
