A Time-domain Monaural Speech Enhancement with Feedback Learning
arXiv - CS - Sound Pub Date : 2020-03-22 , DOI: arxiv-2003.09815
Andong Li, Chengshi Zheng, Linjuan Cheng, Renhua Peng, Xiaodong Li

In this paper, we propose FTNet, a time-domain neural network with feedback learning for monaural speech enhancement. The network consists of three principal components. The first is a stage recurrent neural network, which uses a memory mechanism to aggregate deep feature dependencies across stages and removes interference stage by stage. The second is a convolutional auto-encoder. The third is a series of concatenated gated linear units, which facilitate information flow and gradually increase the receptive field. Feedback learning is adopted to improve parameter efficiency, so the number of trainable parameters is reduced without sacrificing performance. Experiments on the TIMIT corpus show that the proposed network consistently achieves better PESQ and STOI scores than two state-of-the-art time-domain baselines under different conditions.
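As a rough illustration of the third component, the sketch below shows a generic gated linear unit applied to a 1-D signal, and how stacking such units can widen the receptive field. It is not the FTNet implementation: the abstract does not say how the receptive field grows, so causal dilated convolution is assumed here as one common choice, and the names `dilated_conv1d`, `gated_linear_unit`, and `receptive_field` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function used as the gate activation."""
    return 1.0 / (1.0 + math.exp(-x))


def dilated_conv1d(signal, kernel, dilation):
    """Causal dilated 1-D convolution with zero padding on the left.

    out[t] = sum_k kernel[k] * signal[t - k * dilation]
    (Assumption: dilation is used only to illustrate receptive-field growth.)
    """
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * signal[idx]
        out.append(acc)
    return out


def gated_linear_unit(signal, w_linear, w_gate, dilation=1):
    """Generic GLU: linear branch multiplied elementwise by a sigmoid gate."""
    linear = dilated_conv1d(signal, w_linear, dilation)
    gate = dilated_conv1d(signal, w_gate, dilation)
    return [a * sigmoid(b) for a, b in zip(linear, gate)]


def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of stacked dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    # With a zero gate kernel, the gate is sigmoid(0) = 0.5 everywhere.
    print(gated_linear_unit(x, w_linear=[1.0], w_gate=[0.0]))
    # Three stacked layers with kernel size 3 and dilations 1, 2, 4.
    print(receptive_field(3, [1, 2, 4]))
```

Under these assumptions, each added unit with a larger dilation extends how far back in the signal the output can see, while the sigmoid gate controls how much of the linear branch passes through.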

Updated: 2020-11-06