Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory
arXiv - CS - Machine Learning. Pub Date: 2020-06-29, DOI: arxiv-2006.16800
Antonio Carta, Alessandro Sperduti, Davide Bacciu

The effectiveness of recurrent neural networks is largely influenced by their ability to store, in their dynamic memory, information extracted from input sequences at different frequencies and timescales. Such a feature can be introduced into a neural architecture by an appropriate modularization of the dynamic memory. In this paper we propose a novel incrementally trained recurrent architecture that explicitly targets multi-scale learning. First, we show how to extend the architecture of a simple RNN by separating its hidden state into different modules, each subsampling the network's hidden activations at a different frequency. Then, we discuss a training algorithm in which new modules are iteratively added to the model to learn progressively longer dependencies. Each new module works at a slower frequency than the previous ones and is initialized to encode the subsampled sequence of hidden activations. Experimental results on synthetic and real-world datasets for speech recognition and handwritten character recognition show that the modular architecture and the incremental training algorithm improve the ability of recurrent neural networks to capture long-term dependencies.
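To make the two ideas in the abstract concrete, below is a minimal, illustrative sketch (not the authors' released code) of (1) a recurrent memory split into modules, each updated at a progressively slower frequency, and (2) incremental growth, where a new, slower module is appended to learn longer dependencies. The class and method names (`MultiScaleRNN`, `grow`), the power-of-two clocking schedule, and the use of `nn.RNNCell` are assumptions made for illustration; the paper's initialization of new modules from the subsampled hidden activations is only indicated by a comment, since the abstract does not specify the exact procedure.

```python
# Illustrative sketch of a multi-scale modular RNN with incremental growth.
# Assumptions (not from the paper): power-of-two update strides, nn.RNNCell
# modules, and random initialization of newly added modules.
import torch
import torch.nn as nn


class MultiScaleRNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_modules=1):
        super().__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        # Module i is clocked every 2**i steps; each cell reads the raw input
        # concatenated with the previous state of the fastest module.
        self.cells = nn.ModuleList(
            nn.RNNCell(input_size + hidden_size, hidden_size)
            for _ in range(num_modules)
        )

    def grow(self):
        """Append one module that runs at half the frequency of the slowest
        existing one. In the paper the new module is initialized to encode
        the subsampled sequence of hidden activations; here it is simply
        randomly initialized (placeholder)."""
        self.cells.append(
            nn.RNNCell(self.input_size + self.hidden_size, self.hidden_size)
        )

    def forward(self, x):
        # x: (seq_len, batch, input_size)
        seq_len, batch, _ = x.shape
        states = [x.new_zeros(batch, self.hidden_size) for _ in self.cells]
        outputs = []
        for t in range(seq_len):
            fast_prev = states[0]          # fastest module's previous state
            for i, cell in enumerate(self.cells):
                if t % (2 ** i) == 0:      # slower modules skip most steps
                    inp = torch.cat([x[t], fast_prev], dim=-1)
                    states[i] = cell(inp, states[i])
            outputs.append(torch.cat(states, dim=-1))
        return torch.stack(outputs)        # (seq_len, batch, num_modules * hidden_size)


# Sketch of the incremental schedule: train, grow a slower module, repeat.
model = MultiScaleRNN(input_size=8, hidden_size=16, num_modules=1)
for stage in range(3):
    seq = torch.randn(32, 4, 8)            # dummy batch: 32 steps, batch of 4
    out = model(seq)                        # a real training step would go here
    model.grow()                            # add a slower memory module
print(out.shape)                            # torch.Size([32, 4, 48]) at the last stage
```

In this sketch the readout simply concatenates the states of all modules, so each growth stage widens the representation; how the paper actually combines or reads out the modules is not specified in the abstract.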

Updated: 2020-07-01