Path Capsule Networks

Abstract

The capsule network (CapsNet) was introduced as an enhancement over convolutional neural networks, supplementing the latter's invariance properties with equivariance through pose estimation. CapsNet achieved competitive performance with a shallow architecture and a significantly reduced parameter count. However, the width of the first layer in CapsNet still accounts for a large share of its parameters, and its shallowness may limit the representational power of the capsules. To address these limitations, we introduce the Path Capsule Network (PathCapsNet), a deep, parallel, multi-path version of CapsNet. We show that a judicious coordination of depth, max-pooling, regularization by DropCircuit, and a new fan-in routing-by-agreement technique can achieve better or comparable results to CapsNet, while further reducing the parameter count significantly.
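As a minimal illustration of the ideas the abstract names, the PyTorch sketch below builds several deep, narrow convolutional paths with max-pooling and silences whole paths at random during training, in the spirit of DropCircuit. All specifics are assumptions rather than the paper's configuration: the 28×28 single-channel input, the four paths, the 8-dimensional capsules, and the 0.5 path-drop probability are illustrative, the routing stage is omitted, and only the squash nonlinearity follows the standard CapsNet formulation.

```python
# Illustrative sketch only -- not the authors' implementation.
import torch
import torch.nn as nn


def squash(s, dim=-1, eps=1e-8):
    """Standard CapsNet squashing: short vectors shrink toward zero,
    long vectors approach unit length."""
    norm_sq = (s ** 2).sum(dim=dim, keepdim=True)
    return (norm_sq / (1.0 + norm_sq)) * s / torch.sqrt(norm_sq + eps)


class MultiPathPrimaryCaps(nn.Module):
    """Parallel convolutional paths, each emitting one group of primary
    capsules; whole paths are randomly silenced while training."""

    def __init__(self, num_paths=4, caps_dim=8, drop_path_p=0.5):
        super().__init__()
        self.drop_path_p = drop_path_p
        self.paths = nn.ModuleList([
            nn.Sequential(                        # one deep, narrow path
                nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),                  # 28x28 -> 14x14
                nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),                  # 14x14 -> 7x7
                nn.Conv2d(32, caps_dim, 3, padding=1),
            )
            for _ in range(num_paths)
        ])

    def forward(self, x):
        outs = []
        for path in self.paths:
            h = path(x)                           # (B, caps_dim, 7, 7)
            if self.training and torch.rand(()).item() < self.drop_path_p:
                h = torch.zeros_like(h)           # DropCircuit-style: drop the whole path
            outs.append(h.flatten(2).transpose(1, 2))  # (B, 49, caps_dim)
        return squash(torch.cat(outs, dim=1))     # capsules concatenated across paths


caps = MultiPathPrimaryCaps().train()
print(caps(torch.randn(2, 1, 28, 28)).shape)      # torch.Size([2, 196, 8])
```

Dropping an entire path, rather than individual units as in standard dropout, encourages each path to learn an independently useful representation, which is the intuition behind DropCircuit-style regularization of parallel circuits.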

Acknowledgements

We acknowledge the use of Athena at HPC Midlands+, which was funded by the EPSRC under Grant EP/P020232/1, as part of the HPC Midlands+ consortium. This work was partially supported by a grant from Microsoft's AI for Earth program.

Author information

Corresponding author

Correspondence to Mohammed Amer.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Amer, M., Maul, T. Path Capsule Networks. Neural Process Lett 52, 545–559 (2020). https://doi.org/10.1007/s11063-020-10273-0
