Abstract
Recurrent neural networks (RNNs) are powerful dynamical models, widely used in machine learning (ML) and neuroscience. Prior theoretical work has focused on RNNs with additive interactions. However, gating, i.e., multiplicative, interactions are ubiquitous in real neurons and are also the central feature of the best-performing RNNs in ML. Here, we show that gating offers flexible control of two salient features of the collective dynamics: (i) timescales and (ii) dimensionality. The gate controlling timescales leads to a novel, marginally stable state, where the network functions as a flexible integrator. Unlike previous approaches, gating permits this important function without parameter fine-tuning or special symmetries. Gates also provide a flexible, context-dependent mechanism to reset the memory trace, thus complementing the memory function. The gate modulating the dimensionality can induce a novel, discontinuous chaotic transition, in which inputs push a stable system into strongly chaotic activity, in contrast to the typically stabilizing effect of inputs. At this transition, unlike in additive RNNs, the proliferation of critical points (topological complexity) is decoupled from the appearance of chaotic dynamics (dynamical complexity). The rich dynamics are summarized in phase diagrams, which provide ML practitioners with a map for principled parameter-initialization choices.
- Received 15 March 2021
- Revised 21 June 2021
- Accepted 11 November 2021
DOI: https://doi.org/10.1103/PhysRevX.12.011011
Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.
Popular Summary
Recurrent neural networks (RNNs) are responsible for impressive advances in modern artificial intelligence, and they are widely used to model collective behavior of neurons in brains. The success of modern RNNs can be largely attributed to one key feature: gating, a multiplicative interaction (also present in real neurons) that controls the flow of information. Despite this spectacular success, it is unclear how gating shapes the behavior of RNNs. We provide a comprehensive theory of gating in RNNs, which gives engineers a road map to push the state of the art in applications and provides neuroscientists with new insights into the collective behavior of neurons.
Gating either suppresses or admits input to a neuron, whether real or artificial. In RNNs, this capability greatly improves information processing. In our work, we study how gating affects the dynamics of a model RNN by extending a classic RNN model to include gates that influence timescale and dimensionality. The timescale gate leads to a novel, marginally stable state that can functionally serve as a neural integrator. The dimensionality gate gives rise to an input-induced transition to chaos. The interplay of these dynamical phenomena with inputs allows gated RNNs to serve as flexible memories.
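To make the model class concrete, the sketch below simulates a continuous-time RNN with two multiplicative gates: one scaling the state update (a timescale gate) and one filtering the recurrent input (a dimensionality gate). This is an illustrative toy in the spirit of the model described above, not the paper's exact equations; the gate parameterization, the use of instantaneous rather than dynamical gates, and all variable names (J, Uz, Ur, etc.) are assumptions made for illustration.

```python
import numpy as np

# Illustrative sketch of a gated RNN (not the paper's exact equations).
# A timescale gate z multiplies the state update; a dimensionality gate r
# multiplicatively filters the recurrent input. Setting z = r = 1 recovers
# the classic additive RNN.
rng = np.random.default_rng(0)
N = 500        # number of units
g = 2.0        # recurrent gain; g > 1 drives the ungated model chaotic
dt = 0.05      # Euler integration step
steps = 2000

J = rng.normal(0.0, g / np.sqrt(N), size=(N, N))     # random recurrent weights
Uz = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))  # weights into the timescale gate
Ur = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))  # weights into the dimensionality gate

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

h = rng.normal(size=N)  # initial network state
for _ in range(steps):
    phi = np.tanh(h)              # firing rates
    z = sigmoid(Uz @ phi)         # timescale gate in (0, 1); z -> 0 freezes the state
    r = sigmoid(Ur @ phi)         # dimensionality gate on the recurrent input
    h = h + dt * z * (-h + J @ (r * phi))  # Euler update of the gated dynamics
```

In this toy picture, pushing z toward zero slows the effective dynamics arbitrarily, which is the intuition behind the marginally stable integrator state, while modulating r controls how much recurrent input reaches each unit and hence the dimensionality of the collective activity.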
Our work demonstrates the power of physics-based approaches in studying models in machine learning and neuroscience and further invigorates the intense activity at this fertile interface.