Gabor capsule network with preprocessing blocks for the recognition of complex images

  • Original Paper
Machine Vision and Applications

Abstract

The capsule network (CapsNet) is a novel concept that demonstrates the importance of learning spatial hierarchical relationships between features for the effective recognition of images. However, the baseline capsule network performs poorly on complex images. This limitation can partially be attributed to the inability of CapsNets to extract important features from the input images, as well as to their attempt to account for every object in the image, including background objects. To address these problems, we propose a variant of the capsule network that is less complex yet robust, with strong feature extraction capabilities. The model combines a Gabor filter with a custom preprocessing block to learn the structural and semantic information in the image. This restricts extraction to the important features, resulting in improved activation diagrams that enable meaningful hierarchical information to be learned. Experimental results show that the proposed model achieves test accuracies of 85.24%, 68.17%, 94.78% and 91.50% on the CIFAR-10, CIFAR-100, Fashion-MNIST and kvasir-dataset-v2 datasets, respectively. The performance of the proposed model is comparable to that of state-of-the-art models on these datasets, with a relatively small number of parameters.
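As an illustration of the Gabor filter mentioned in the abstract, the following is a minimal sketch of the standard real-valued 2D Gabor kernel (a Gaussian envelope multiplied by a cosine carrier) and a small bank of orientations. It is not the authors' implementation: the function names `gabor_kernel` and `gabor_bank`, the parameter defaults, and the choice of four orientations are assumptions made only for this example.

```python
import math


def gabor_kernel(size=7, sigma=2.0, theta=0.0, lambd=4.0, gamma=0.5, psi=0.0):
    """Return a size x size real Gabor kernel as a list of rows.

    All parameter names and defaults are illustrative assumptions:
    sigma  - standard deviation of the Gaussian envelope
    theta  - orientation of the filter in radians
    lambd  - wavelength of the cosine carrier in pixels
    gamma  - spatial aspect ratio of the envelope
    psi    - phase offset of the carrier
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a centre pixel")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by theta.
            x_rot = x * math.cos(theta) + y * math.sin(theta)
            y_rot = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope times cosine carrier.
            envelope = math.exp(
                -(x_rot ** 2 + (gamma * y_rot) ** 2) / (2.0 * sigma ** 2)
            )
            carrier = math.cos(2.0 * math.pi * x_rot / lambd + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_bank(n_orientations=4, **kwargs):
    """Build kernels at evenly spaced orientations in [0, pi)."""
    return [
        gabor_kernel(theta=i * math.pi / n_orientations, **kwargs)
        for i in range(n_orientations)
    ]


bank = gabor_bank(n_orientations=4, size=7)
print(len(bank), len(bank[0]), len(bank[0][0]))
```

Kernels like these are commonly used as fixed or initial weights for a first convolutional layer, so that the early feature maps respond to oriented edges and textures. Whether the proposed model uses them in this way is not stated in the abstract.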

Funding

This research was supported by the National Natural Science Foundation of China (NSFC Grant No. 61550110248); Research on Sino-Tibetan multi-source information acquisition, fusion, data mining and its application (Grant No. H04W170186) and Sichuan Science and Technology Program (Grant No. 2019YFG0190).

Author information

Corresponding author

Correspondence to Yongbin Yu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Abra Ayidzoe, M., Yu, Y., Mensah, P.K. et al. Gabor capsule network with preprocessing blocks for the recognition of complex images. Machine Vision and Applications 32, 91 (2021). https://doi.org/10.1007/s00138-021-01221-6

  • DOI: https://doi.org/10.1007/s00138-021-01221-6
