The RWTH ASR System for TED-LIUM Release 2: Improving Hybrid HMM with SpecAugment
arXiv - CS - Sound. Pub Date: 2020-04-02, DOI: arxiv-2004.00960
Wei Zhou, Wilfried Michel, Kazuki Irie, Markus Kitza, Ralf Schlüter, Hermann Ney

We present a complete training pipeline to build a state-of-the-art hybrid HMM-based ASR system on the 2nd release of the TED-LIUM corpus. Data augmentation using SpecAugment is successfully applied to improve performance on top of our best SAT model using i-vectors. By investigating the effect of different maskings, we achieve improvements from SpecAugment on hybrid HMM models without increasing model size and training time. A subsequent sMBR training is applied to fine-tune the final acoustic model, and both LSTM and Transformer language models are trained and evaluated. Our best system achieves a 5.6% WER on the test set, which outperforms the previous state-of-the-art by 27% relative.
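
The abstract does not specify the masking parameters used in the paper; as a rough illustration of what SpecAugment-style augmentation does to the acoustic features, here is a minimal sketch of frequency and time masking on a log-mel feature matrix, assuming NumPy and purely hypothetical mask counts and widths:

```python
import numpy as np

def spec_augment(features, num_freq_masks=2, max_freq_width=8,
                 num_time_masks=2, max_time_width=20, rng=None):
    """SpecAugment-style masking sketch (illustrative parameters only).

    features: (num_frames, num_freq_bins) log-mel feature matrix.
    Masked regions are set to zero.
    """
    rng = rng or np.random.default_rng()
    x = features.copy()
    num_frames, num_bins = x.shape

    # Frequency masking: zero out randomly placed bands of feature bins.
    for _ in range(num_freq_masks):
        width = int(rng.integers(0, max_freq_width + 1))
        start = int(rng.integers(0, max(1, num_bins - width)))
        x[:, start:start + width] = 0.0

    # Time masking: zero out randomly placed spans of frames.
    for _ in range(num_time_masks):
        width = int(rng.integers(0, max_time_width + 1))
        start = int(rng.integers(0, max(1, num_frames - width)))
        x[start:start + width, :] = 0.0

    return x
```

Because the masking is applied on the fly to existing features, it adds no parameters to the acoustic model and, as the paper notes, need not increase training time.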

Updated: 2020-04-03