
Hyper Autoencoders

Neural Processing Letters

Abstract

We introduce the hyper autoencoder architecture, in which a secondary hypernetwork generates the weights of the encoder and decoder layers of the primary autoencoder. The hyper autoencoder uses a one-layer linear hypernetwork that takes a single embedding vector as input and predicts all the weights of the autoencoder. Because the hypernetwork is smaller than the network it generates, it acts as a regularizer. Like the vanilla autoencoder, the hyper autoencoder can be used for unsupervised or semi-supervised learning. In this study, we also present a semi-supervised model that combines convolutional neural networks and autoencoders with the hypernetwork. Our experiments on five image datasets, namely MNIST, Fashion MNIST, LFW, STL-10 and CelebA, show that the hyper autoencoder performs well on both unsupervised and semi-supervised learning problems.
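To make the weight-generation mechanism concrete, below is a minimal PyTorch sketch of the idea, not the paper's implementation: a learnable embedding vector passes through a single linear layer whose output is reshaped into the weights and biases of a one-hidden-layer autoencoder. All layer sizes and names (`HyperAE`, `emb_dim`, and so on) are illustrative assumptions, and the scheme that keeps the hypernetwork smaller than the generated autoencoder is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperAE(nn.Module):
    """Sketch of a hyper autoencoder: a one-layer linear hypernetwork
    maps a learnable embedding vector to all weights of a one-hidden-layer
    autoencoder. Dimensions are illustrative, not those of the paper."""

    def __init__(self, in_dim=784, hid_dim=64, emb_dim=32):
        super().__init__()
        self.in_dim, self.hid_dim = in_dim, hid_dim
        # Number of autoencoder parameters the hypernetwork must predict:
        # encoder W1 (hid x in) + b1 (hid), decoder W2 (in x hid) + b2 (in).
        n_params = hid_dim * in_dim + hid_dim + in_dim * hid_dim + in_dim
        self.z = nn.Parameter(torch.randn(emb_dim) * 0.01)  # embedding vector
        self.hyper = nn.Linear(emb_dim, n_params)            # one-layer linear hypernetwork

    def forward(self, x):
        p = self.hyper(self.z)  # flat vector holding all autoencoder weights
        i = 0
        W1 = p[i:i + self.hid_dim * self.in_dim].view(self.hid_dim, self.in_dim)
        i += self.hid_dim * self.in_dim
        b1 = p[i:i + self.hid_dim]; i += self.hid_dim
        W2 = p[i:i + self.in_dim * self.hid_dim].view(self.in_dim, self.hid_dim)
        i += self.in_dim * self.hid_dim
        b2 = p[i:i + self.in_dim]
        h = torch.relu(F.linear(x, W1, b1))        # encoder with generated weights
        return torch.sigmoid(F.linear(h, W2, b2))  # decoder with generated weights

# Training sketch: only the embedding z and the hypernetwork are optimized;
# the autoencoder weights are produced on the fly at every forward pass.
model = HyperAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(128, 784)          # stand-in for a batch of flattened MNIST images
loss = F.mse_loss(model(x), x)    # reconstruction loss
loss.backward()
opt.step()
```

Since the generated weights feed directly into the forward pass, gradients flow through them back to the embedding and the hypernetwork, so standard backpropagation suffices.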





Acknowledgements

I would like to thank Ethem Alpaydın for his guidance and helpful suggestions throughout this work.

Author information

Correspondence to Derya Soydaner.



About this article


Cite this article

Soydaner, D. Hyper Autoencoders. Neural Process Lett 52, 1395–1413 (2020). https://doi.org/10.1007/s11063-020-10310-y

