GEDI: Gammachirp envelope distortion index for predicting intelligibility of enhanced speech

Speech Communication (IF 3.2), Pub Date: 2020-06-17, DOI: 10.1016/j.specom.2020.06.001
Katsuhiko Yamamoto, Toshio Irino, Shoko Araki, Keisuke Kinoshita, Tomohiro Nakatani

In this study, we propose a new concept, the gammachirp envelope distortion index (GEDI), to predict the intelligibility of speech enhanced by nonlinear algorithms. GEDI is based on the signal-to-distortion ratio in the auditory envelope, SDR_env. Its objective is to calculate the distortion between enhanced and clean-speech representations in the domain of a temporal envelope extracted by the gammachirp auditory filterbank and a modulation filterbank. We also extend GEDI with multi-resolution analysis (mr-GEDI) to predict the speech intelligibility of sounds under non-stationary noise conditions. We evaluate GEDI by predicting the intelligibility of speech sounds enhanced by classic spectral subtraction and by a Wiener filtering method. The predictions are compared with human results for various signal-to-noise ratio conditions with additive pink and babble noise. The results showed that mr-GEDI predicted the intelligibility curves better than the short-time objective intelligibility (STOI) measure, the extended STOI (ESTOI) measure, and the hearing-aid speech perception index (HASPI) under pink-noise conditions, and better than HASPI under babble-noise conditions. The mr-GEDI method does not show a tendency to overestimate intelligibility and is considered a more conservative approach than STOI and ESTOI. Therefore, evaluation with mr-GEDI may provide additional information in the development of speech enhancement algorithms.
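
To make the idea of a signal-to-distortion ratio in the envelope domain concrete, the following is a minimal sketch, not the authors' implementation. It assumes a crude envelope extractor (full-wave rectification followed by moving-average smoothing) in place of the gammachirp auditory filterbank and modulation filterbank, and it assumes the distortion is the difference between the enhanced and clean envelopes. The function names `temporal_envelope` and `sdr_env_db`, the `cutoff_hz` parameter, and the synthetic test signals are all hypothetical choices for illustration.

```python
import numpy as np


def temporal_envelope(x, fs, cutoff_hz=30.0):
    """Crude temporal envelope: full-wave rectify, then moving-average smooth.

    ASSUMPTION: this is a stand-in for the gammachirp auditory filterbank and
    modulation filterbank named in the abstract, not a reproduction of them.
    """
    win = max(1, int(fs / cutoff_hz))      # smoothing window length in samples
    kernel = np.ones(win) / win            # moving-average (low-pass) kernel
    return np.convolve(np.abs(x), kernel, mode="same")


def sdr_env_db(clean, enhanced, fs, eps=1e-12):
    """Envelope-domain signal-to-distortion ratio in dB.

    ASSUMPTION: distortion is taken as the difference between the enhanced
    and clean envelopes; the paper's exact definition may differ.
    """
    env_clean = temporal_envelope(clean, fs)
    env_enh = temporal_envelope(enhanced, fs)
    distortion = env_enh - env_clean
    p_signal = np.mean(env_clean ** 2)     # clean-envelope power
    p_distortion = np.mean(distortion ** 2)
    return 10.0 * np.log10((p_signal + eps) / (p_distortion + eps))


# Synthetic example: an amplitude-modulated tone with two noise levels.
fs = 16000
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * t) * (1 + 0.5 * np.sin(2 * np.pi * 4 * t))
mild = clean + 0.05 * rng.standard_normal(fs)   # lightly degraded version
heavy = clean + 0.5 * rng.standard_normal(fs)   # heavily degraded version

sdr_mild = sdr_env_db(clean, mild, fs)
sdr_heavy = sdr_env_db(clean, heavy, fs)
print(f"SDR_env (mild):  {sdr_mild:.1f} dB")
print(f"SDR_env (heavy): {sdr_heavy:.1f} dB")
```

In this sketch a more heavily degraded signal yields a lower envelope-domain ratio. Mapping such a ratio to a predicted intelligibility score, and the multi-resolution analysis of mr-GEDI, are outside its scope.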


Updated: 2020-06-17
