Hybrid Autoregressive Transducer (HAT), arXiv - CS - Machine Learning


arXiv - CS - Machine Learning. Pub Date: 2020-03-12, DOI: arxiv-2003.07705
Ehsan Variani, David Rybach, Cyril Allauzen, Michael Riley

This paper proposes and evaluates the hybrid autoregressive transducer (HAT) model, a time-synchronous encoder-decoder model that preserves the modularity of conventional automatic speech recognition systems. The HAT model provides a way to measure the quality of the internal language model, which can be used to decide whether inference with an external language model is beneficial. The paper also presents a finite-context version of the HAT model that addresses the exposure bias problem and significantly simplifies overall training and inference. We evaluate the proposed model on a large-scale voice search task. Our experiments show significant improvements in word error rate (WER) compared to state-of-the-art approaches.
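For readers unfamiliar with the metric, WER is conventionally the word-level edit distance between a reference transcript and the recognizer's hypothesis, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' scoring code; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; tokenization is plain
    whitespace splitting, which real scoring tools usually refine.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("some" -> "sum") and one deletion ("music").
    print(word_error_rate("play some jazz music", "play sum jazz"))
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.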


Updated: 2020-03-18
