Multi-task Language Modeling for Improving Speech Recognition of Rare Words


arXiv - CS - Computation and Language. Pub Date: 2020-11-23, DOI: arxiv-2011.11715
Chao-Han Huck Yang, Linda Liu, Ankur Gandhe, Yile Gu, Anirudh Raju, Denis Filimonov, Ivan Bulyko

End-to-end automatic speech recognition (ASR) systems are increasingly popular due to their relative architectural simplicity and competitive performance. However, even though the average accuracy of these systems may be high, their performance on rare content words often lags behind that of hybrid ASR systems. To address this problem, second-pass rescoring is often applied. In this paper, we propose a second-pass system with multi-task learning, utilizing semantic targets (such as intent and slot prediction) to improve speech recognition performance. We show that our rescoring model trained with these additional tasks outperforms the baseline rescoring model, trained with only the language modeling task, by 1.4% on a general test set and by 2.6% on a rare word test set in terms of relative word-error-rate reduction (WERR).
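
As a rough illustration of the two ideas the abstract relies on, the following Python sketch shows generic second-pass n-best rescoring and a relative word-error-rate reduction calculation. It is not the authors' model: the function names, the interpolation weight `lm_weight`, the toy scores, and the WER values are all hypothetical, and the abstract does not specify how first-pass and second-pass scores are combined.

```python
from typing import Callable, List, Tuple

# Each hypothesis is (text, first_pass_log_score). Hypothetical structure.
Hypothesis = Tuple[str, float]


def rescore_nbest(
    nbest: List[Hypothesis],
    lm_score: Callable[[str], float],
    lm_weight: float = 0.5,
) -> str:
    """Return the hypothesis text with the best combined score.

    Assumption: scores are log-domain and combined by simple linear
    interpolation (first-pass score + lm_weight * second-pass LM score).
    """
    best_text, _ = max(nbest, key=lambda h: h[1] + lm_weight * lm_score(h[0]))
    return best_text


def werr(baseline_wer: float, new_wer: float) -> float:
    """Relative word-error-rate reduction, in percent."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Toy second-pass scores standing in for a trained rescoring model.
toy_scores = {"play the beetles": -3.0, "play the beatles": -1.0}
nbest = [("play the beetles", -1.0), ("play the beatles", -1.2)]

best = rescore_nbest(nbest, lambda text: toy_scores[text])
reduction = werr(baseline_wer=20.0, new_wer=19.0)  # illustrative values only
```

Here the first pass slightly prefers the wrong spelling, and the second-pass score is what moves the rare word to the top of the list. A multi-task rescoring model as described in the abstract would replace `toy_scores` with a language model trained jointly on intent and slot prediction.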


Updated: 2020-11-25
