Dynamic temporal residual network for sequence modeling
International Journal on Document Analysis and Recognition (IF 2.3), Pub Date: 2019-07-02, DOI: 10.1007/s10032-019-00328-x
Ruijie Yan, Liangrui Peng, Shanyu Xiao, Michael T. Johnson, Shengjin Wang

The long short-term memory (LSTM) network with a gating mechanism has been widely used in sequence modeling tasks, including handwriting and speech recognition. Because an LSTM network can be unfolded along the temporal dimension, and its temporal depth equals the length of the input feature sequence, gating alone might not be sufficient to fully model the dynamic temporal dependencies in sequential data. Inspired by residual learning in ResNet, this paper proposes a dynamic temporal residual network (DTRN) that incorporates residual learning into an LSTM network along the temporal dimension. DTRN involves two networks: its primary network consists of modified LSTM units with weighted shortcut connections between adjacent temporal outputs, while its secondary network generates dynamic weights for the shortcut connections. To validate the performance of DTRN, we conduct experiments on three commonly used public handwriting recognition datasets (IFN/ENIT, IAM and Rimes) and one speech recognition dataset (TIMIT). The experimental results show that the proposed DTRN outperforms previously reported methods.
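
The two-network structure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's equations, so everything below is an assumption made for illustration: the update rule `h_t = o_t * tanh(c_t) + alpha_t * h_{t-1}`, the single-layer sigmoid secondary network that produces `alpha_t`, and the names `TemporalResidualLSTM`, `step` and `forward` are all hypothetical and should not be read as the authors' formulation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a real model would learn these by training."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TemporalResidualLSTM:
    """Toy LSTM with a dynamically weighted shortcut between adjacent outputs.

    Assumed (hypothetical) update, not taken from the paper:
        h_t = o_t * tanh(c_t) + alpha_t * h_{t-1}
    """

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        concat = input_size + hidden_size
        self.hidden_size = hidden_size
        # Primary network: one weight matrix per LSTM gate (biases omitted).
        self.gates = {
            name: random_matrix(hidden_size, concat, rng)
            for name in ("i", "f", "o", "g")
        }
        # Secondary network (assumed form): one linear layer + sigmoid.
        self.shortcut = random_matrix(hidden_size, concat, rng)

    def step(self, x, h_prev, c_prev):
        z = x + h_prev  # list concatenation [x_t ; h_{t-1}]
        i = [sigmoid(v) for v in matvec(self.gates["i"], z)]
        f = [sigmoid(v) for v in matvec(self.gates["f"], z)]
        o = [sigmoid(v) for v in matvec(self.gates["o"], z)]
        g = [math.tanh(v) for v in matvec(self.gates["g"], z)]
        # Standard LSTM cell-state update.
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
        # Dynamic shortcut weights, one per hidden unit, each in (0, 1).
        alpha = [sigmoid(v) for v in matvec(self.shortcut, z)]
        # Usual LSTM output plus the weighted shortcut from the previous output.
        h = [
            ok * math.tanh(ck) + ak * hk
            for ok, ck, ak, hk in zip(o, c, alpha, h_prev)
        ]
        return h, c, alpha

    def forward(self, sequence):
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        outputs, alphas = [], []
        for x in sequence:
            h, c, alpha = self.step(list(x), h, c)
            outputs.append(h)
            alphas.append(alpha)
        return outputs, alphas
```

Calling `TemporalResidualLSTM(input_size=3, hidden_size=4).forward(seq)` on a list of 3-dimensional feature vectors returns one 4-dimensional output and one vector of shortcut weights per time step. Because the shortcut adds to the gated output, `h_t` in this sketch is not confined to the interval (-1, 1) as it is in a standard LSTM.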

Updated: 2019-07-02