Abstract
Although Deep Neural Networks (DNNs) have achieved great success in the machine learning domain, they typically perform poorly on few-shot learning tasks, in which a classifier must generalize quickly after seeing only a few samples from each class. Model-Agnostic Meta Learning (MAML) addresses this problem: a MAML model can adapt to new learning tasks using only a small amount of training data. As a baseline for image classification, a MAML model with a Convolutional Neural Network (CNN) architecture (rather than a plain DNN) is implemented and trained on the Omniglot dataset. However, this baseline suffers from a long training process and relatively low efficiency. To address these problems, we introduce Recurrent Neural Network (RNN) architectures into the MAML model, including Long Short-Term Memory (LSTM) and its variants LSTM-b and the Gated Recurrent Unit (GRU). The experimental results, measured by classification accuracy, demonstrate a considerable improvement in both image classification performance and training efficiency over the baseline model.
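To illustrate the adaptation scheme the abstract refers to, the following is a minimal sketch of the MAML idea on a toy 1-D regression problem (not the authors' CNN/RNN models). Each task is a line y = a·x with a random slope, the model is a single parameter w, and we use the first-order approximation of the MAML meta-gradient; the step sizes and task distribution are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss_and_grad(w, x, y):
    # Mean-squared error of the linear model y_hat = w * x, and its gradient in w.
    err = w * x - y
    return np.mean(err ** 2), np.mean(2.0 * err * x)

w = 0.0                   # meta-initialization being learned
alpha, beta = 0.05, 0.01  # inner- and outer-loop step sizes (assumed values)

for step in range(2000):
    a = rng.uniform(1.0, 3.0)  # sample a task: target slope
    x_s, x_q = rng.normal(size=5), rng.normal(size=5)  # support / query inputs
    y_s, y_q = a * x_s, a * x_q

    # Inner loop: one gradient step on the task's support set.
    _, g_s = loss_and_grad(w, x_s, y_s)
    w_task = w - alpha * g_s

    # Outer loop (first-order MAML): update the initialization using
    # the query-set gradient evaluated at the adapted parameters.
    _, g_q = loss_and_grad(w_task, x_q, y_q)
    w -= beta * g_q

# The learned initialization settles near the center of the task
# distribution (slope ~2.0), from which one inner step adapts well.
print(w)
```

The point of the sketch is the two nested loops: the inner step adapts to a single task from a shared initialization, while the outer step moves that initialization so one gradient step suffices on a new task. Full MAML differentiates through the inner update (a second-order computation), which this first-order variant omits.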
CONFLICT OF INTEREST
The authors declare that they have no conflicts of interest.
Cite this article
Shaodong Chen and Ziyu Niu, The Research about Recurrent Model-Agnostic Meta Learning, Opt. Mem. Neural Networks 29, 56–67 (2020). https://doi.org/10.3103/S1060992X20010075