Estimators of the Magnitude-Squared Spectrum and Methods for Incorporating SNR Uncertainty
IEEE/ACM Transactions on Audio, Speech, and Language Processing (IF 4.1), Pub Date: 2011-07-01, DOI: 10.1109/tasl.2010.2082531
Yang Lu, Philipos C. Loizou

Statistical estimators of the magnitude-squared spectrum are derived based on the assumption that the magnitude-squared spectrum of the noisy speech signal can be computed as the sum of the (clean) signal and noise magnitude-squared spectra. Maximum a posteriori (MAP) and minimum mean square error (MMSE) estimators are derived based on a Gaussian statistical model. The gain function of the MAP estimator was found to be identical to the gain function used in the ideal binary mask (IdBM), which is widely used in computational auditory scene analysis (CASA). As such, it was binary: it took the value 1 if the local SNR exceeded 0 dB and the value 0 otherwise. By modeling the local instantaneous SNR as an F-distributed random variable, soft masking methods incorporating SNR uncertainty were derived. In particular, the soft masking method that weighted the noisy magnitude-squared spectrum by the a priori probability that the local SNR exceeds 0 dB was shown to be identical to the Wiener gain function. Results indicated that the proposed estimators yielded significantly better speech quality than the conventional MMSE spectral power estimators, with lower residual noise and lower speech distortion.
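The two gain functions named in the abstract can be contrasted with a minimal sketch. The abstract itself gives no formulas, so this assumes the textbook definitions: a binary mask with a 0 dB threshold and the Wiener gain xi / (1 + xi), where xi is the local SNR as a linear power ratio. The function names and the sample values are hypothetical and are not taken from the paper.

```python
def db_to_linear(snr_db):
    """Convert an SNR in dB to a linear power ratio."""
    return 10.0 ** (snr_db / 10.0)


def binary_mask_gain(snr_linear):
    """Hard gain: 1 if the local SNR exceeds 0 dB (ratio > 1), else 0."""
    return 1.0 if snr_linear > 1.0 else 0.0


def wiener_gain(snr_linear):
    """Soft gain (assumed textbook form): xi / (1 + xi)."""
    return snr_linear / (1.0 + snr_linear)


def apply_gain(noisy_power, snrs_linear, gain_fn):
    """Weight each noisy magnitude-squared bin by its gain."""
    return [gain_fn(s) * y for y, s in zip(noisy_power, snrs_linear)]


# Hypothetical noisy magnitude-squared values and local SNRs for three bins.
noisy_power = [4.0, 1.0, 0.25]
snr_db = [6.0, 0.0, -6.0]
snrs = [db_to_linear(s) for s in snr_db]

hard = apply_gain(noisy_power, snrs, binary_mask_gain)
soft = apply_gain(noisy_power, snrs, wiener_gain)
```

In this sketch the hard mask keeps a bin unchanged or discards it entirely, while the soft gain scales every bin by a value strictly between 0 and 1, which is the kind of graded weighting the soft masking methods in the abstract provide.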

Updated: 2019-11-01