DBNet: A Dual-branch Network Architecture Processing on Spectrum and Waveform for Single-channel Speech Enhancement
arXiv - CS - Sound, Pub Date: 2021-05-06, DOI: arxiv-2105.02436
Kanghao Zhang, Shulin He, Hao Li, Xueliang Zhang

In real acoustic environments, speech enhancement is a difficult task: it aims to improve the quality and intelligibility of speech degraded by background noise and reverberation. In recent years, deep learning has shown great potential for speech enhancement. In this paper, we propose a novel real-time framework called DBNet, a dual-branch structure with alternate interconnection. Each branch uses an encoder-decoder architecture with skip connections. The two branches are responsible for spectrum and waveform modeling, respectively. A bridge layer is adopted to exchange information between the two branches. Systematic evaluation and comparison show that the proposed system substantially outperforms related algorithms under very challenging environments. In the INTERSPEECH 2021 Deep Noise Suppression (DNS) Challenge, the proposed system ranks in the top 8 in real-time Track 1 in terms of the Mean Opinion Score (MOS) of the ITU-T P.835 framework.
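The data flow described in the abstract (two encoder-decoder branches with skip connections, joined by a bridge) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no layer types, sizes, or bridge formula, so `layer`, `bridge`, the `mix` factor, the fixed weights, and the assumption that both branches carry features of equal length are all hypothetical placeholders chosen only to make the structure visible.

```python
import math


def layer(x, weight, bias=0.0):
    """Toy stand-in for a learned layer: element-wise affine map plus tanh."""
    return [math.tanh(weight * v + bias) for v in x]


def bridge(spec_feat, wave_feat, mix=0.5):
    """Hypothetical bridge: each branch receives a scaled copy of the other's features."""
    new_spec = [s + mix * w for s, w in zip(spec_feat, wave_feat)]
    new_wave = [w + mix * s for s, w in zip(spec_feat, wave_feat)]
    return new_spec, new_wave


def dual_branch_forward(spec_in, wave_in, depth=2):
    """Run both branches through `depth` encoder layers and `depth` decoder layers."""
    spec, wave = spec_in, wave_in
    spec_skips, wave_skips = [], []

    # Encoder: transform each branch, remember the output for the skip
    # connection, then exchange features through the bridge.
    for i in range(depth):
        spec = layer(spec, weight=0.9 + 0.1 * i)
        wave = layer(wave, weight=1.1 + 0.1 * i)
        spec_skips.append(spec)
        wave_skips.append(wave)
        spec, wave = bridge(spec, wave)

    # Decoder: add the mirrored encoder output (skip connection) before
    # each layer, then exchange features again.
    for _ in range(depth):
        spec = [a + b for a, b in zip(spec, spec_skips.pop())]
        wave = [a + b for a, b in zip(wave, wave_skips.pop())]
        spec = layer(spec, weight=0.8)
        wave = layer(wave, weight=0.8)
        spec, wave = bridge(spec, wave)

    return spec, wave


# Hypothetical inputs: four spectrum-branch features and four waveform-branch features.
spec_out, wave_out = dual_branch_forward([0.1, -0.2, 0.3, 0.0], [0.05, 0.0, -0.1, 0.2])
```

In a real system each `layer` would be a trained module and the two branches would operate on differently shaped spectrum and waveform representations, so the bridge would also need to convert between the two domains; the sketch omits that step.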

Updated: 2021-05-07