Power Exponent Based Weighting Criterion for DNN-Based Mask Approximation in Speech Enhancement
IEEE Signal Processing Letters ( IF 3.2 ) Pub Date : 2021-03-05 , DOI: 10.1109/lsp.2021.3063888
Zihao Cui 1 , Changchun Bao 1

This letter proposes a novel weighted mean square error (WMSE) to improve DNN-based mask approximation for speech enhancement, in which the weighting is determined by a power exponent applied to the noisy spectrum amplitude (NSA) as the base. Power exponents of 0 and 2 correspond, respectively, to ideal amplitude masking (IAM) without any clipping and to indirect mapping (IM) on the short-time spectral amplitude (STSA); tests show that the exponent is highly related to the enhanced spectrum and to the performance of the enhanced signal. The experimental results also show that, for phase-unaware masking, the best weighting is the NSA with a power exponent of 1, which yields better restoration of harmonic structure. Compared with MSE-based mask approximation methods, the objective function with the WMSE on the NSA (WMSE-NSA) improves the perceptual evaluation of speech quality (PESQ) score by 0.1 and the short-time objective intelligibility (STOI) score by 1.7% on average.
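
The relation between the exponent and the two limiting cases can be checked numerically. The following is a minimal sketch, not the letter's implementation: it assumes the weight is the noisy amplitude raised to the power p and that the training target is the unclipped IAM (clean amplitude divided by noisy amplitude). The function names `wmse` and `stsa_mse` and the sample values are hypothetical, and any normalization the authors may use is not reproduced here.

```python
def wmse(mask_est, clean_amp, noisy_amp, p):
    """Mean of noisy_amp**p * (mask_est - IAM)**2 over time-frequency bins.

    Assumption: the weight is the noisy spectrum amplitude raised to p,
    and IAM = clean_amp / noisy_amp with no clipping (noisy_amp > 0).
    """
    total = 0.0
    for m_hat, s, y in zip(mask_est, clean_amp, noisy_amp):
        iam = s / y                      # unclipped ideal amplitude mask
        total += (y ** p) * (m_hat - iam) ** 2
    return total / len(mask_est)


def stsa_mse(mask_est, clean_amp, noisy_amp):
    """Plain MSE between the masked noisy amplitude and the clean amplitude."""
    total = 0.0
    for m_hat, s, y in zip(mask_est, clean_amp, noisy_amp):
        total += (m_hat * y - s) ** 2    # error measured on the STSA itself
    return total / len(mask_est)


# Hypothetical values for three time-frequency bins.
mask_est = [0.5, 0.8, 0.2]
clean_amp = [1.0, 0.3, 2.0]
noisy_amp = [2.0, 0.5, 4.0]

loss_p0 = wmse(mask_est, clean_amp, noisy_amp, 0)   # unweighted mask MSE
loss_p1 = wmse(mask_est, clean_amp, noisy_amp, 1)   # NSA-weighted case
loss_p2 = wmse(mask_est, clean_amp, noisy_amp, 2)   # equals the STSA error
loss_im = stsa_mse(mask_est, clean_amp, noisy_amp)
```

With p = 2 the weight cancels the denominator of the mask, so the loss coincides with the error measured directly on the spectral amplitude, while p = 0 reduces to the ordinary MSE on the mask. The exponent 1 sits between the two, giving high-energy bins more influence than plain mask MSE but less than indirect mapping.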

Updated: 2021-04-13