Musical noise suppression using a low-rank and sparse matrix decomposition approach, Speech Communication



Musical noise suppression using a low-rank and sparse matrix decomposition approach
Speech Communication (IF 2.4), Pub Date: 2020-09-21, DOI: 10.1016/j.specom.2020.09.001
Jishnu Sadasivan, Jitendra K. Dhiman, Chandra Sekhar Seelamantula

We address the problem of suppressing musical noise in speech that has been enhanced using a short-time processing algorithm. Enhancement algorithms rely on noise statistics, and errors in estimating these statistics lead to residual noise in the enhanced signal. A frequently encountered type of residual noise is the so-called musical noise, which is a consequence of spurious peaks occurring at random locations in the time-frequency (t-f) plane. Typically, speech enhancement algorithms operate on a short-time basis and attenuate the spectral coefficients of the noisy speech, which effectively amounts to applying a spectrotemporal gain function. We show that, for speech distorted by musical noise, the spectrotemporal gain function has a distinct signature: the musical noise components are sparse in the t-f domain, whereas the spectrotemporal gain corresponding to the speech region exhibits a low-rank structure. Based on this observation, we propose a low-rank and sparse matrix decomposition of the spectrotemporal gain function. We show that musical noise can be effectively suppressed by reconstructing the speech signal using only the low-rank component. Performance comparison in terms of subjective scores and spectrographic analysis shows that the proposed technique outperforms two benchmark techniques. The proposed technique could be used in tandem with any speech enhancement algorithm that gives rise to musical noise.
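The decomposition described in the abstract can be illustrated with a minimal sketch. The abstract does not say which solver or parameters the authors use, so the code below is an assumption: it swaps in a generic principal component pursuit (nuclear norm plus weighted L1 norm) solved with a standard ADMM iteration. The function names, the default `lam` and `mu` values, and the synthetic gain matrix are all hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_threshold(x, tau):
    """Elementwise shrinkage: proximal operator of tau * (L1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def svd_threshold(x, tau):
    """Singular-value shrinkage: proximal operator of tau * (nuclear norm)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def low_rank_sparse(gain, lam=None, mu=None, n_iter=500, tol=1e-7):
    """Split `gain` into low + sparse with a generic ADMM solver.

    Assumed stand-in for the paper's decomposition, not the authors' code.
    """
    m, n = gain.shape
    # Commonly used default weights for principal component pursuit.
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
    mu = m * n / (4.0 * np.abs(gain).sum()) if mu is None else mu
    low = np.zeros_like(gain)
    sparse = np.zeros_like(gain)
    dual = np.zeros_like(gain)
    for _ in range(n_iter):
        low = svd_threshold(gain - sparse + dual / mu, 1.0 / mu)
        sparse = soft_threshold(gain - low + dual / mu, lam / mu)
        residual = gain - low - sparse
        dual += mu * residual
        if np.linalg.norm(residual) <= tol * np.linalg.norm(gain):
            break
    return low, sparse


# Synthetic (hypothetical) gain: a rank-1 "speech" pattern over 64 frequency
# bins and 100 frames, plus isolated spikes standing in for musical noise.
rng = np.random.default_rng(0)
freq_profile = np.linspace(0.6, 0.1, 64)
time_envelope = 0.55 + 0.45 * np.sin(np.linspace(0.0, 6.0 * np.pi, 100))
clean = np.outer(freq_profile, time_envelope)
spikes = 0.4 * (rng.random(clean.shape) < 0.03)
gain = clean + spikes

low, sparse = low_rank_sparse(gain)
```

Following the abstract, only `low` would then be kept: it would be applied to the noisy short-time spectrum in place of the original gain before resynthesis, while `sparse` (the isolated peaks) is discarded. That resynthesis step is not shown here, since the abstract gives no details of the transform or the enhancement algorithm.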


Updated: 2020-10-11
