Review of Time–Frequency Masking Approach for Improving Speech Intelligibility in Noise, IETE Technical Review

IETE Technical Review (IF 2.5), Pub Date: 2021-02-22, DOI: 10.1080/02564602.2021.1886610
Gibak Kim

Over the last decade, time–frequency (T–F) masking techniques have been explored as a way to substantially improve speech intelligibility in noise. A binary or soft mask can be applied to noisy speech for speech separation. The binary masking approach retains the T–F units of the noise-corrupted signal where the target speech is stronger than the interfering noise and removes the T–F units where the interfering noise is dominant. While a binary mask takes only the values 0 or 1, a soft mask can take any value, mostly in the range from 0 to 1, and is closely related to the frequency-domain Wiener filter gain. Motivated by intelligibility studies of speech synthesized using the ideal binary (or soft) mask, a number of subsequent studies have addressed estimating the T–F mask for practical use. This paper reviews T–F masking strategies, covering their definition, preliminary studies with the ideal mask, and the estimation of the mask in practice.
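
The two mask types described above can be illustrated with a minimal sketch. It assumes the per-unit speech and noise powers are known, which is the "ideal" setting and is not available in practice. The function names, the `threshold_db` parameter, and the toy values are hypothetical and are not taken from the paper. The soft mask uses the common Wiener-gain form S / (S + N) as one example of a soft mask, not necessarily the definition the review adopts.

```python
import numpy as np

def ideal_binary_mask(speech_power, noise_power, threshold_db=0.0):
    """Return 1 for T-F units whose local SNR exceeds threshold_db, else 0.

    threshold_db = 0 keeps units where the target speech is stronger
    than the noise (assumed default; other thresholds are possible).
    """
    eps = np.finfo(float).tiny  # guards against log10(0) and division by zero
    local_snr_db = 10.0 * np.log10((speech_power + eps) / (noise_power + eps))
    return (local_snr_db > threshold_db).astype(float)

def wiener_soft_mask(speech_power, noise_power):
    """Return a soft mask in [0, 1] using the Wiener-gain form S / (S + N)."""
    eps = np.finfo(float).tiny
    return speech_power / (speech_power + noise_power + eps)

# Toy power values on a 2 x 3 grid (frequency x time); purely illustrative.
speech_power = np.array([[4.0, 1.0, 0.0],
                         [9.0, 0.25, 1.0]])
noise_power = np.array([[1.0, 1.0, 2.0],
                        [1.0, 1.0, 3.0]])

ibm = ideal_binary_mask(speech_power, noise_power)
soft = wiener_soft_mask(speech_power, noise_power)

# Either mask is applied by element-wise multiplication with the noisy
# T-F representation (here a made-up complex array standing in for an STFT).
noisy_tf = np.ones((2, 3), dtype=complex)
separated_binary = ibm * noisy_tf
separated_soft = soft * noisy_tf

print(ibm)
print(soft)
```

With these toy values the binary mask keeps only the first column, where the speech power exceeds the noise power, while the soft mask attenuates every unit in proportion to its share of speech power.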

Updated: 2021-02-22
