
The Research about Recurrent Model-Agnostic Meta Learning


Abstract

Although Deep Neural Networks (DNNs) have achieved great success in the machine learning domain, they usually perform poorly on few-shot learning tasks, where a classifier must generalize quickly after seeing only a few samples from each class. Model-Agnostic Meta Learning (MAML) addresses this problem: it can solve new learning tasks using only a small amount of training data. As a baseline for image classification tasks, we implement a MAML model with a Convolutional Neural Network (CNN) architecture (rather than a plain DNN) and train it on the Omniglot dataset. However, this baseline suffers from a long training process and relatively low efficiency. To address these problems, we introduce Recurrent Neural Network (RNN) architectures into the MAML model, including the Long Short-Term Memory (LSTM) architecture and its variants LSTM-b and the Gated Recurrent Unit (GRU). The experimental results, measured by accuracy, demonstrate a considerable improvement in both image classification performance and training efficiency over the baseline model.
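
For orientation, the following is a minimal sketch of a single MAML meta-update as described in the abstract: an inner-loop gradient step adapts the parameters on each task's support set, and an outer-loop step updates the shared initialization from the query-set losses. It is written in PyTorch under stated assumptions; the model, the task batch format, and the step size inner_lr are illustrative, not the authors' exact implementation, and any backbone (the CNN baseline or a recurrent LSTM/GRU variant) plugs in as model.

    # Minimal MAML meta-update sketch in PyTorch (illustrative, not the
    # authors' exact code). `model` is any classifier (e.g. a CNN, LSTM,
    # or GRU backbone); each task supplies a support set and a query set.
    import torch
    import torch.nn.functional as F

    def maml_meta_step(model, meta_opt, tasks, inner_lr=0.4):
        """One meta-update over a batch of few-shot tasks."""
        meta_loss = 0.0
        for support_x, support_y, query_x, query_y in tasks:
            # Inner loop: one gradient step on the support set, keeping
            # the graph so the meta-gradient can flow through the update.
            params = dict(model.named_parameters())
            logits = torch.func.functional_call(model, params, (support_x,))
            grads = torch.autograd.grad(
                F.cross_entropy(logits, support_y),
                list(params.values()), create_graph=True)
            adapted = {name: p - inner_lr * g
                       for (name, p), g in zip(params.items(), grads)}
            # Outer loop: evaluate the adapted parameters on the query set
            # and accumulate the meta-objective.
            logits = torch.func.functional_call(model, adapted, (query_x,))
            meta_loss = meta_loss + F.cross_entropy(logits, query_y)
        meta_opt.zero_grad()
        meta_loss.backward()
        meta_opt.step()
        return meta_loss.item()

Because the inner update is differentiated through (create_graph=True), the outer optimizer trains an initialization that adapts well in one step, which is what makes the method model-agnostic with respect to the backbone architecture.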





CONFLICT OF INTEREST

The authors declare that they have no conflicts of interest.

Author information


Corresponding author

Correspondence to Shaodong Chen.

About this article


Cite this article

Shaodong Chen and Ziyu Niu, The Research about Recurrent Model-Agnostic Meta Learning, Opt. Mem. Neural Networks 29, 56–67 (2020). https://doi.org/10.3103/S1060992X20010075

