SETransformer: Speech Enhancement Transformer
Cognitive Computation (IF 5.4) | Pub Date: 2021-02-03 | DOI: 10.1007/s12559-020-09817-2
Weiwei Yu, Jian Zhou, HuaBin Wang, Liang Tao

Speech enhancement is a fundamental way to improve speech perception quality in adverse environments where the received speech is seriously corrupted by noise. In this paper, we propose a cognitive-computing-based speech enhancement model, termed SETransformer, which can improve speech quality in unknown noisy environments. The proposed SETransformer takes advantage of the LSTM and the multi-head attention mechanism, both of which are inspired by the auditory perception principle of human beings. Specifically, the SETransformer possesses the ability to characterize the local structure implicated in the speech spectrum, and it has lower computational complexity due to its distinctive parallelization capability. Experimental results show that, compared with the standard Transformer and the LSTM model, the proposed SETransformer model consistently achieves better denoising performance in terms of speech quality (PESQ) and speech intelligibility (STOI) under unseen noise conditions.
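The abstract describes the architecture only at a high level, so the following is a minimal PyTorch sketch of one plausible block that combines an LSTM (for local temporal structure) with multi-head self-attention (for parallelizable global context), as the abstract describes. The class name SETransformerBlock, the layer dimensions, the residual/normalization layout, and the sigmoid-mask output are all illustrative assumptions, not the authors' published design.

```python
import torch
import torch.nn as nn

class SETransformerBlock(nn.Module):
    """Hypothetical sketch of one enhancement block: an LSTM followed by
    multi-head self-attention. Sizes and ordering are assumptions."""
    def __init__(self, feat_dim=257, hidden_dim=256, num_heads=4):
        super().__init__()
        # LSTM captures local sequential structure across spectrogram frames
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        # Multi-head self-attention models longer-range context in parallel
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(hidden_dim)
        # Project back to the spectral feature dimension
        self.out = nn.Linear(hidden_dim, feat_dim)

    def forward(self, noisy_spec):          # (batch, frames, feat_dim)
        h, _ = self.lstm(noisy_spec)        # local temporal modeling
        a, _ = self.attn(h, h, h)           # global context via attention
        h = self.norm(h + a)                # residual connection + layer norm
        mask = torch.sigmoid(self.out(h))   # bounded spectral mask in [0, 1]
        return mask * noisy_spec            # enhanced magnitude spectrum

# Usage: enhance a batch of 8 noisy magnitude spectra (100 frames, 257 bins)
x = torch.rand(8, 100, 257)
enhanced = SETransformerBlock()(x)
```

The residual connection and layer normalization follow standard Transformer practice; the sigmoid mask simply bounds the output so the block attenuates rather than amplifies noisy spectral bins. The paper's actual model may arrange these components differently.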



Last updated: 2021-02-03