Multichannel Speech Enhancement by Raw Waveform-mapping using Fully Convolutional Networks
arXiv - CS - Sound. Pub Date: 2019-09-26, DOI: arxiv-1909.11909
Chang-Le Liu, Sze-Wei Fu, You-Jin Li, Jen-Wei Huang, Hsin-Min Wang, Yu Tsao

In recent years, waveform-mapping-based speech enhancement (SE) methods have garnered significant attention. These methods generally use a deep learning model to directly process and reconstruct speech waveforms. Because both the input and output are in waveform format, the waveform-mapping-based SE methods can overcome the distortion caused by imperfect phase estimation, which may be encountered in spectral-mapping-based SE systems. So far, most waveform-mapping-based SE methods have focused on single-channel tasks. In this paper, we propose a novel fully convolutional network (FCN) with Sinc and dilated convolutional layers (termed SDFCN) for multichannel SE that operates in the time domain. We also propose an extended version of SDFCN, called the residual SDFCN (termed rSDFCN). The proposed methods are evaluated on two multichannel SE tasks, namely the dual-channel inner-ear microphones SE task and the distributed microphones SE task. The experimental results confirm the outstanding denoising capability of the proposed SE systems on both tasks and the benefits of using the residual architecture on the overall SE performance.
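
The abstract names two building blocks, Sinc and dilated convolutional layers, applied to multichannel waveforms in the time domain. The NumPy sketch below only illustrates those two generic operations; it is not the SDFCN or rSDFCN architecture, whose layer counts, kernel sizes, and training setup are not given here. The function names, the cutoff frequencies, the 16 kHz sample rate, and the Hamming window are all assumptions made for illustration.

```python
import numpy as np

def sinc_bandpass(f_low, f_high, kernel_size, sample_rate):
    """Windowed-sinc band-pass kernel (difference of two low-pass filters).

    Illustrative only: in a Sinc layer the cutoffs would be learned,
    here they are fixed arguments.
    """
    # Time axis in seconds, centred on zero (use an odd kernel_size).
    t = (np.arange(kernel_size) - (kernel_size - 1) / 2) / sample_rate
    # np.sinc(x) = sin(pi x) / (pi x); 2f * sinc(2 f t) is an ideal low-pass at f Hz.
    low_pass_high = 2 * f_high * np.sinc(2 * f_high * t)
    low_pass_low = 2 * f_low * np.sinc(2 * f_low * t)
    kernel = (low_pass_high - low_pass_low) * np.hamming(kernel_size)
    # Divide by the sample rate to convert to discrete-time filter taps.
    return kernel / sample_rate

def dilated_conv1d(x, kernel, dilation=1):
    """Zero-padded, same-length 1-D dilated cross-correlation of x with kernel."""
    x = np.asarray(x, dtype=float)
    span = (len(kernel) - 1) * dilation  # receptive field minus one sample
    pad_left = span // 2
    padded = np.pad(x, (pad_left, span - pad_left))
    out = np.zeros(len(x))
    for i, w in enumerate(kernel):
        # Tap i reads the input at an offset of i * dilation samples.
        out += w * padded[i * dilation : i * dilation + len(x)]
    return out

def multichannel_layer(waveforms, kernels, dilation=1):
    """Sum per-channel dilated convolutions: [C, T] input, [C, K] kernels, [T] output."""
    return sum(dilated_conv1d(x, k, dilation) for x, k in zip(waveforms, kernels))

# Two-channel toy input: one second of noise per microphone at 16 kHz.
rng = np.random.default_rng(0)
noisy = rng.standard_normal((2, 16000))
taps = np.stack([sinc_bandpass(1000.0, 3000.0, 101, 16000)] * 2)
features = multichannel_layer(noisy, taps, dilation=2)
```

Increasing `dilation` widens the span of input samples each output sample depends on without adding kernel taps, which is the usual reason for using dilated convolutions on long waveforms.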

Updated: 2020-02-25