Homophone-based Label Smoothing in End-to-End Automatic Speech Recognition
arXiv - CS - Sound. Pub Date: 2020-04-07. DOI: arxiv-2004.03437. Authors: Yi Zheng, Xianjie Yang, Xuyong Dang
This paper proposes a new label smoothing method for automatic speech
recognition (ASR) that exploits homophones, a form of human-level prior
knowledge about a language. Compared with its forerunners, the proposed
method uses pronunciation knowledge of homophones in a more complex way.
The method requires an end-to-end ASR model that learns the acoustic model
and the language model jointly and that uses characters as modelling units.
Experiments with a hybrid CTC sequence-to-sequence model show that the new
method can reduce the character error rate (CER) by 0.4% absolute.
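
The abstract does not spell out how the smoothing mass is assigned, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes that a fraction `eps` of the target probability is split evenly among the homophones of the reference character, with ordinary uniform label smoothing as a fallback when no homophone is known. The function name, the `homophones` dictionary, the toy vocabulary, and the value of `eps` are all hypothetical.

```python
def homophone_smoothed_target(target, vocab, homophones, eps=0.1):
    """Build a soft target distribution over `vocab` for one reference character.

    Assumption (not taken from the paper): the true character keeps 1 - eps,
    and eps is shared equally among its in-vocabulary homophones.
    """
    index = {ch: i for i, ch in enumerate(vocab)}
    if len(vocab) == 1:
        return [1.0]  # nothing to smooth towards

    dist = [0.0] * len(vocab)
    # Homophones of the target that exist in the vocabulary (target excluded).
    peers = [ch for ch in homophones.get(target, ()) if ch in index and ch != target]

    if peers:
        share = eps / len(peers)
        for ch in peers:
            dist[index[ch]] += share
    else:
        # Fallback: standard uniform label smoothing over all other characters.
        share = eps / (len(vocab) - 1)
        for i in range(len(vocab)):
            if i != index[target]:
                dist[i] = share

    dist[index[target]] = 1.0 - eps
    return dist


# Toy example: the three characters below are all pronounced "ta" (first tone).
vocab = ["他", "她", "它", "是", "的"]
homophones = {"他": ["她", "它"], "她": ["他", "它"], "它": ["他", "她"]}

soft_ta = homophone_smoothed_target("他", vocab, homophones)
soft_shi = homophone_smoothed_target("是", vocab, homophones)
```

In training, a distribution like `soft_ta` would replace the one-hot target in the cross-entropy term of the character-level decoder. This is why the method needs characters as modelling units: homophone sets are defined per character.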
Updated: 2020-05-15