Accelerating RNN Transducer Inference via Adaptive Expansion Search
IEEE Signal Processing Letters (IF 3.2) Pub Date: 2020-01-01, DOI: 10.1109/lsp.2020.3036335
Juntae Kim, Yoonhan Lee, Eesung Kim

Recurrent neural network transducers (RNN-T) are a promising end-to-end speech recognition framework that transduces input acoustic frames into a character sequence. Best-first and breadth-first searches have been used as decoding strategies for RNN-T. However, best-first search performs its expansion search sequentially, which slows down decoding. Breadth-first search replaces this sequential process with a parallel one, but it unnecessarily conducts an expansion search at every decoding step. Because the character sequence is much shorter than the sequence of decoding frames, most frames correspond to a blank symbol, so these unnecessary expansions incur computational overhead. To address these limitations, we introduce an adaptive expansion search (AES) to accelerate RNN-T inference. AES overcomes the aforementioned limitations by batching the hypotheses and adopting a decision-making process that determines whether to continue the expansion search; thus, AES can skip unnecessary expansion searches. Furthermore, pruning is applied to AES for further acceleration. We achieved a significant speedup and a lower word error rate compared with other baselines.
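The core decision described in the abstract — stop expanding at a frame as soon as the blank symbol is the most probable output — can be sketched in a toy form. The function below is a hypothetical, heavily simplified illustration, not the paper's AES: it operates on precomputed per-frame symbol scores instead of an RNN-T joint network, and it omits hypothesis batching and pruning. The name `adaptive_expansion_decode`, the score-suppression trick standing in for re-running the joint network, and the `max_expand` cap are all assumptions made for illustration.

```python
def adaptive_expansion_decode(frame_scores, blank_id=0, max_expand=3):
    """Toy sketch of adaptive expansion: per frame, expand only while
    a non-blank symbol is most probable, up to max_expand symbols.
    frame_scores: one list of symbol scores per decoding frame
    (a hypothetical stand-in for RNN-T joint-network outputs)."""
    hypothesis = []
    for scores in frame_scores:
        scores = list(scores)  # copy so suppression stays frame-local
        for _ in range(max_expand):
            best = max(range(len(scores)), key=scores.__getitem__)
            if best == blank_id:
                break  # blank wins: skip any further expansion here
            hypothesis.append(best)
            # Suppress the emitted symbol so the next expansion step
            # picks a different one (toy stand-in for updating the
            # prediction-network state and re-evaluating the joint).
            scores[best] = float("-inf")
    return hypothesis

# Frame 1 emits symbols 1 then 2 before blank dominates;
# frame 2 is blank immediately, so no expansion search is run there.
print(adaptive_expansion_decode([[0.1, 2.0, 0.5], [3.0, 0.2, 0.1]]))
```

Since most frames in real utterances are blank, the early `break` on blank is what would save work relative to a breadth-first search that expands at every step.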
