The RWTH ASR System for TED-LIUM Release 2: Improving Hybrid HMM with SpecAugment
arXiv - CS - Sound. Pub Date: 2020-04-02, DOI: arxiv-2004.00960
Wei Zhou, Wilfried Michel, Kazuki Irie, Markus Kitza, Ralf Schlüter, Hermann Ney

We present a complete training pipeline to build a state-of-the-art hybrid HMM-based ASR system on the 2nd release of the TED-LIUM corpus. Data augmentation using SpecAugment is successfully applied to improve performance on top of our best SAT model using i-vectors. By investigating the effect of different maskings, we achieve improvements from SpecAugment on hybrid HMM models without increasing model size and training time. A subsequent sMBR training is applied to fine-tune the final acoustic model, and both LSTM and Transformer language models are trained and evaluated. Our best system achieves a 5.6% WER on the test set, which outperforms the previous state-of-the-art by 27% relative.
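
The abstract does not specify the masking parameters used in the paper; as a rough illustration of what SpecAugment-style augmentation does to the acoustic features, here is a minimal sketch of frequency and time masking on a log-mel feature matrix, assuming NumPy and purely hypothetical mask counts and widths:

```python
import numpy as np

def spec_augment(features, num_freq_masks=2, max_freq_width=8,
                 num_time_masks=2, max_time_width=20, rng=None):
    """SpecAugment-style masking sketch (illustrative parameters only).

    features: (num_frames, num_freq_bins) log-mel feature matrix.
    Masked regions are set to zero.
    """
    rng = rng or np.random.default_rng()
    x = features.copy()
    num_frames, num_bins = x.shape

    # Frequency masking: zero out randomly placed bands of feature bins.
    for _ in range(num_freq_masks):
        width = int(rng.integers(0, max_freq_width + 1))
        start = int(rng.integers(0, max(1, num_bins - width)))
        x[:, start:start + width] = 0.0

    # Time masking: zero out randomly placed spans of frames.
    for _ in range(num_time_masks):
        width = int(rng.integers(0, max_time_width + 1))
        start = int(rng.integers(0, max(1, num_frames - width)))
        x[start:start + width, :] = 0.0

    return x
```

Because the masking is applied on the fly to existing features, it adds no parameters to the acoustic model and, as the paper notes, need not increase training time.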

Updated: 2020-04-03