
Object recognition algorithm based on optimized nonlinear activation function-global convolutional neural network

  • Original article

Abstract

Traditional object recognition algorithms cannot meet the accuracy requirements of real-world warehousing and logistics applications. In recent years, the rapid development of deep learning has provided a technical route for addressing this problem, and a number of deep-learning-based object recognition algorithms have been proposed, promoted, and applied. However, deep learning still faces two problems in object recognition: first, the activation functions used in deep models have limited nonlinear modeling ability; second, deep models perform a large number of repeated pooling operations, during which feature information is lost. To address these shortcomings, this paper proposes a multiple-parameter exponential linear unit with a uniform, learnable parameter form: two learned parameters are introduced into the exponential linear unit (ELU) so that it can represent both piecewise linear and exponential nonlinear functions, giving it stronger nonlinear modeling capability. At the same time, to mitigate the information loss caused by repeated pooling, this paper proposes a new global convolutional neural network structure that makes full use of the local and global information in the feature maps of different layers, reducing the feature information lost in pooling. Based on these ideas, this paper presents an object recognition algorithm built on the optimized nonlinear activation function-global convolutional neural network. Experiments on the CIFAR-100 and ImageNet datasets show that the proposed method not only achieves better recognition accuracy than traditional machine learning and other deep learning models but also exhibits good stability and robustness.
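
To make the two ideas in the abstract concrete, the sketch below shows one plausible reading of them in PyTorch. It is illustrative only, not the authors' implementation: the parameterization of the multiple-parameter ELU (per-channel learnable parameters alpha and beta acting on the negative branch, so that a small beta makes the branch nearly linear while alpha = beta = 1 recovers the standard ELU) and the fusion head (global average pooling of feature maps from several depths, concatenated before the classifier) are assumptions, as are the names MPELU, GlobalFeatureFusionHead, alpha_init, beta_init, and channels_per_stage.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MPELU(nn.Module):
    """Multiple-parameter ELU sketch: two learnable parameters per channel
    shape the negative branch. With a small beta the branch behaves almost
    linearly (alpha * beta * x); with alpha = beta = 1 it reduces to the
    standard ELU, so the unit spans piecewise-linear and exponential forms."""

    def __init__(self, num_channels, alpha_init=1.0, beta_init=1.0):
        super().__init__()
        # one (alpha, beta) pair per channel, broadcast over height and width
        self.alpha = nn.Parameter(torch.full((1, num_channels, 1, 1), alpha_init))
        self.beta = nn.Parameter(torch.full((1, num_channels, 1, 1), beta_init))

    def forward(self, x):
        # negative branch: alpha * (exp(beta * x) - 1); positive branch: identity
        neg = self.alpha * (torch.exp(self.beta * torch.clamp(x, max=0.0)) - 1.0)
        return torch.where(x > 0, x, neg)


class GlobalFeatureFusionHead(nn.Module):
    """Illustrative fusion head: globally average-pool feature maps taken from
    several network depths and concatenate them before the classifier, so that
    information discarded by later pooling stages can still reach the output."""

    def __init__(self, channels_per_stage, num_classes):
        super().__init__()
        self.fc = nn.Linear(sum(channels_per_stage), num_classes)

    def forward(self, feature_maps):
        pooled = [F.adaptive_avg_pool2d(f, 1).flatten(1) for f in feature_maps]
        return self.fc(torch.cat(pooled, dim=1))


# Usage sketch: fuse features from three hypothetical backbone stages.
# head = GlobalFeatureFusionHead([64, 128, 256], num_classes=100)
# logits = head([f1, f2, f3])  # each f_i has shape (N, C_i, H_i, W_i)
```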

Data availability statement

The data used to support the findings of this study are included within the paper.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61701188), the China Postdoctoral Science Foundation (No. 2019M650512), the Beijing Intelligent Logistics System Collaborative Innovation Center (No. BILSCIC-2019KF-22), and the Hebei IoT Monitoring Engineering Technology Research Center (No. IOT202004).

Author information

Corresponding authors

Correspondence to Feng-Ping An or Lei Bai.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

An, FP., Liu, Je. & Bai, L. Object recognition algorithm based on optimized nonlinear activation function-global convolutional neural network. Vis Comput 38, 541–553 (2022). https://doi.org/10.1007/s00371-020-02033-x
