
Aggregated squeeze-and-excitation transformations for densely connected convolutional networks

Original article · The Visual Computer

Abstract

Recently, convolutional neural networks (CNNs) have achieved great success in computer vision but suffer from parameter redundancy in large-scale networks. DenseNet is a typical CNN architecture that connects each layer to every other layer to maximize feature reuse and network efficiency; however, it can become parametrically expensive, with a potential risk of overfitting in deep networks. To address these problems, we propose a lightweight Densely Connected and Inter-Sparse convolutional network with aggregated Squeeze-and-Excitation transformations (DenisNet-SE). First, Squeeze-and-Excitation (SE) blocks are introduced at different locations in the dense model to adaptively recalibrate channel-wise feature responses. We also propose the Squeeze-Excitation-Residual (SERE) block, which applies residual learning to construct an identity mapping. Second, to build the densely connected and inter-sparse structure, we further apply a sparse three-layer bottleneck and grouped convolutions, which increase the cardinality of the transformations. The proposed network is evaluated on three highly competitive object recognition benchmarks (CIFAR-10, CIFAR-100, and ImageNet) and achieves better performance than state-of-the-art networks while requiring fewer parameters.
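To make the abstract's two building blocks concrete, the following is a minimal PyTorch sketch of an SE unit and an SE-Residual (SERE) block built around a grouped three-layer bottleneck. It is an illustration of the general techniques named in the abstract, not the paper's exact configuration: the class names, channel widths, reduction ratio 16, and cardinality 32 are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    # Squeeze: global average pooling; excitation: two FC layers + sigmoid gate.
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = x.mean(dim=(2, 3))            # squeeze: B x C channel descriptor
        w = self.fc(w).view(b, c, 1, 1)   # excite: per-channel weights in (0, 1)
        return x * w                      # recalibrate channel-wise responses

class SEREBlock(nn.Module):
    # Grouped 1x1 -> 3x3 -> 1x1 bottleneck, SE recalibration, identity shortcut.
    def __init__(self, channels, cardinality=32, bottleneck_width=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, bottleneck_width, kernel_size=1, bias=False),
            nn.BatchNorm2d(bottleneck_width),
            nn.ReLU(inplace=True),
            nn.Conv2d(bottleneck_width, bottleneck_width, kernel_size=3,
                      padding=1, groups=cardinality, bias=False),
            nn.BatchNorm2d(bottleneck_width),
            nn.ReLU(inplace=True),
            nn.Conv2d(bottleneck_width, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.se = SEBlock(channels)

    def forward(self, x):
        # Residual learning: the recalibrated bottleneck output is added back
        # onto the identity mapping of the input.
        return torch.relu(x + self.se(self.body(x)))

x = torch.randn(2, 128, 32, 32)            # toy batch: B x C x H x W
assert SEREBlock(128)(x).shape == x.shape  # the block preserves feature shape
```

In a dense block, the output would additionally be concatenated with earlier feature maps rather than only summed; the sketch isolates the channel recalibration, grouped-bottleneck, and shortcut logic.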



Acknowledgements

This work was supported in part by the National Key Research and Development Program (International Technology Cooperation Project) (No. 2019YFE0130700). The authors extend their appreciation to the Deanship of Scientific Research at King Saud University for funding this work through research group No. RGP-264.

Author information


Corresponding author

Correspondence to Tinghuai Ma.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Yang, M., Ma, T., Tian, Q. et al. Aggregated squeeze-and-excitation transformations for densely connected convolutional networks. Vis Comput 38, 2661–2674 (2022). https://doi.org/10.1007/s00371-021-02144-z
