Non causal deep learning based dereverberation
arXiv - CS - Sound Pub Date : 2020-09-06 , DOI: arxiv-2009.02832
Jorge Wuth, Richard M. Stern, Nestor Becerra Yoma

In this paper we demonstrate the effectiveness of non-causal context for mitigating the effects of reverberation in deep-learning-based automatic speech recognition (ASR) systems. First, the value of non-causal context using a non-causal FIR filter is shown by comparing the contributions of previous vs. future information. Second, MLP- and LSTM-based dereverberation networks were trained to confirm the effects of causal and non-causal context when used in ASR systems trained with clean speech. The non-causal deep-learning-based dereverberation provides a 45% relative reduction in word error rate (WER) compared to the popular weighted prediction error (WPE) method in experiments with clean training in the REVERB challenge. Finally, an expanded multicondition training procedure used in combination with a semi-enhanced test utterance generation based on combinations of reverberated and dereverberated signals is proposed to reduce any artifacts or distortion that may be introduced by the non-causal dereverberation methods. The combination of both approaches provided average relative reductions in WER equal to 10.9% and 6.0% when compared to the baseline system obtained with the most recent REVERB challenge recipe without and with WPE, respectively.
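To make the causal vs. non-causal distinction concrete, the following is a minimal sketch and not the authors' implementation. It assumes the input to a dereverberation model is a sequence of feature frames, and that "context" means stacking each frame with some neighbouring frames. The function names `splice_context` and `relative_wer_reduction`, the parameters `past` and `future`, and the edge-padding choice are all hypothetical. Setting `future=0` gives a causal context; any `future > 0` makes it non-causal.

```python
import numpy as np


def splice_context(frames: np.ndarray, past: int, future: int) -> np.ndarray:
    """Stack each frame with `past` previous and `future` following frames.

    frames: array of shape (T, D), one feature vector per time frame.
    Returns an array of shape (T, (past + 1 + future) * D).
    Edges are padded by repeating the first/last frame (an assumption;
    zero padding would be an equally plausible choice).
    """
    T, _ = frames.shape
    padded = np.concatenate(
        [
            np.repeat(frames[:1], past, axis=0),     # left padding
            frames,
            np.repeat(frames[-1:], future, axis=0),  # right padding
        ],
        axis=0,
    )
    # Window k holds the frame at offset (k - past) relative to the current one.
    windows = [padded[k : k + T] for k in range(past + 1 + future)]
    return np.concatenate(windows, axis=1)


def relative_wer_reduction(wer_baseline: float, wer_system: float) -> float:
    """Relative WER reduction in percent: 100 * (baseline - system) / baseline."""
    return 100.0 * (wer_baseline - wer_system) / wer_baseline


if __name__ == "__main__":
    feats = np.arange(5, dtype=float).reshape(5, 1)    # toy 1-D "features"
    causal = splice_context(feats, past=2, future=0)    # past frames only
    non_causal = splice_context(feats, past=1, future=1)  # past and future
    print(causal.shape, non_causal.shape)
```

The two calls use the same window length (three frames), so they differ only in whether the window looks ahead, which is the kind of comparison the abstract describes. `relative_wer_reduction` shows how a relative figure is derived from two absolute WER values; the abstract reports only the relative figures.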

Updated: 2020-09-08