Abstract
Traditional object recognition algorithms cannot meet the accuracy requirements of real-world warehousing and logistics applications. In recent years, the rapid development of deep learning theory has provided a technical approach to this problem, and a number of deep-learning-based object recognition algorithms have been proposed, promoted, and applied. However, deep learning still faces two problems when applied to object recognition: first, the activation functions in deep learning models have poor nonlinear modeling ability; second, deep learning models perform a large number of repeated pooling operations, during which feature information is lost. To address these shortcomings, this paper proposes multiple-parameter exponential linear units with a uniform, learnable parameter form: two learned parameters are introduced into the exponential linear unit (ELU), enabling it to represent both piecewise-linear and exponential nonlinear functions and thus giving it good nonlinear modeling capability. At the same time, to mitigate the information loss caused by repeated pooling, this paper proposes a new global convolutional neural network structure that makes full use of the local and global information in the feature maps of different layers, reducing the loss of feature information across the many pooling operations. Based on these ideas, this paper presents an object recognition algorithm built on the optimized nonlinear activation function and the global convolutional neural network. Experiments were carried out on the CIFAR100 and ImageNet datasets using the proposed algorithm.
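The abstract does not fully specify the two-parameter activation; a minimal sketch, assuming (as in published multi-parameter ELU variants) that one learned parameter `alpha` scales the negative branch and the other, `beta`, controls its curvature (both names are assumptions, not from the paper):

```python
import numpy as np

def mpelu(x, alpha=1.0, beta=1.0):
    """Sketch of a multiple-parameter ELU.

    alpha scales the negative saturation level and beta controls the
    curvature of the negative branch; alpha = beta = 1 recovers the
    standard ELU, while a small alpha makes the unit approach a
    piecewise-linear, ReLU-like shape.
    """
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x, alpha * (np.exp(beta * x) - 1.0))
```

In a network, `alpha` and `beta` would be trained jointly with the weights (per channel or per layer), which is what lets a single unit interpolate between piecewise-linear and exponential nonlinear behavior.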
The results show that the proposed method not only achieves better recognition accuracy than traditional machine learning and other deep learning models but also exhibits good stability and robustness.
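The global network structure is likewise only described at a high level; one common way to "make full use of the local and global information of different layer feature maps" is to globally pool each layer's feature maps and concatenate the results, so that detail from shallow, lightly pooled layers survives alongside deep semantics. A hedged sketch of that idea (the function name and layout are assumptions, not the paper's implementation):

```python
import numpy as np

def fuse_global_features(*feature_maps):
    """Globally average-pool each layer's feature maps and concatenate.

    Each input is assumed to have shape (channels, height, width).
    Pooling over the spatial axes yields one vector per layer; the
    concatenated vector mixes shallow local detail with deep global
    context instead of relying only on the last, heavily pooled layer.
    """
    pooled = [fm.mean(axis=(1, 2)) for fm in feature_maps]
    return np.concatenate(pooled)
```

A classifier head fed with this fused vector sees information that would otherwise be discarded by the repeated pooling stages.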
Data availability statement
The data used to support the findings of this study are included within the paper.
Acknowledgements
This paper is supported by National Natural Science Foundation of China (No. 61701188), China Postdoctoral Science Foundation (No. 2019M650512), Beijing Intelligent Logistics System Collaborative Innovation Center (No. BILSCIC-2019KF-22), and Hebei IoT Monitoring Engineering Technology Research Center funded project (No. IOT202004).
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Cite this article
An, FP., Liu, Je. & Bai, L. Object recognition algorithm based on optimized nonlinear activation function-global convolutional neural network. Vis Comput 38, 541–553 (2022). https://doi.org/10.1007/s00371-020-02033-x