Learning deep hierarchical and temporal recurrent neural networks with residual learning
International Journal of Machine Learning and Cybernetics (IF 5.6) · Pub Date: 2020-01-29 · DOI: 10.1007/s13042-020-01063-0
Tehseen Zia, Assad Abbas, Usman Habib, Muhammad Sajid Khan

Learning both hierarchical and temporal dependencies can be crucial for recurrent neural networks (RNNs) to deeply understand sequences. To this end, a unified RNN framework is required that eases the learning of both deep hierarchical and temporal structures by allowing gradients to propagate back from both ends without vanishing. Residual learning (RL) has emerged as an effective and inexpensive method to facilitate the backward propagation of gradients. However, the significance of RL has so far been shown separately for learning deep hierarchical representations and for learning temporal dependencies; there has been little effort to unify these findings into a single framework for learning deep RNNs. In this study, we aim to show that approximating identity mappings is crucial for optimizing both hierarchical and temporal structures. We propose a framework, called hierarchical and temporal residual RNNs, that learns RNNs by approximating identity mappings across hierarchical and temporal structures. To validate the proposed method, we examine the efficacy of shortcut connections for training deep RNN structures on sequence learning problems. Experiments on the Penn Treebank, Hutter Prize, and IAM-OnDB datasets demonstrate the utility of the framework in terms of accuracy and computational complexity. We show that, even for large datasets, spending parameters on increasing network depth can yield computational benefits while reducing the size of the RNN "state".
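The core idea, adding shortcut (identity) connections along both the layer hierarchy and the time axis, can be sketched in a few lines. The PyTorch snippet below is a minimal illustration under assumed names and details (the `ResidualRNN` class, plain `nn.RNNCell` sub-modules, purely additive shortcuts); it is a sketch of the general technique, not the authors' exact architecture:

```python
import torch
import torch.nn as nn

class ResidualRNN(nn.Module):
    """Stacked RNN with residual connections along both the
    hierarchical (layer) and temporal (time-step) axes, so that
    gradients can flow back through near-identity paths in both
    directions. Illustrative sketch only."""

    def __init__(self, input_size, hidden_size, num_layers):
        super().__init__()
        self.hidden_size = hidden_size
        self.cells = nn.ModuleList(
            [nn.RNNCell(input_size if i == 0 else hidden_size, hidden_size)
             for i in range(num_layers)]
        )

    def forward(self, x):
        # x: (seq_len, batch, input_size)
        seq_len, batch, _ = x.shape
        h = [x.new_zeros(batch, self.hidden_size) for _ in self.cells]
        outputs = []
        for t in range(seq_len):
            inp = x[t]
            for i, cell in enumerate(self.cells):
                # Temporal shortcut: h_t = h_{t-1} + f(input, h_{t-1}),
                # i.e. the cell learns a correction to an identity map.
                h_new = h[i] + cell(inp, h[i])
                # Hierarchical shortcut: add the layer input to the layer
                # output where shapes match (all layers above the first,
                # which maps input_size to hidden_size).
                inp = h_new if i == 0 else inp + h_new
                h[i] = h_new
            outputs.append(inp)
        return torch.stack(outputs)  # (seq_len, batch, hidden_size)

model = ResidualRNN(input_size=8, hidden_size=16, num_layers=4)
y = model(torch.randn(5, 2, 8))  # -> torch.Size([5, 2, 16])
```

Because each update is the previous state plus a learned correction, both the depth-wise and time-wise backward paths contain an identity term, which is the property the abstract credits with easing gradient propagation.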

Updated: 2020-01-29