Image classification algorithm based on stacked sparse coding deep learning model-optimized kernel function nonnegative sparse representation

  • Methodologies and Application
Soft Computing

Abstract

Image classification is an important technique for extracting information from images and is widely used across engineering fields. Although traditional image classification methods are widely applied in practice, they suffer from unsatisfactory results, low classification accuracy and weak adaptive ability. These shortcomings arise because such methods rely on the designer's prior knowledge and understanding of the classification task, and because they treat feature extraction and classification as two separate steps. Deep learning models, by contrast, have a powerful learning ability and integrate feature extraction and classification into a single process, which can effectively improve classification accuracy. However, image classification methods based on deep learning have two problems of their own: first, the complex functions in the deep learning model cannot be approximated effectively; second, the classifier that comes with the deep learning model has low accuracy. To address the first problem, this paper introduces sparse representation into the deep learning network architecture, combining the strong linear decomposition ability of sparse representation for multidimensional data with the structural advantages of deep multi-layer nonlinear mapping to approximate the complex functions in the deep learning model. The result is a deep learning model with adaptive approximation ability, which solves the function approximation problem of deep learning models.
To address the second problem, a sparse representation classification method based on an optimized kernel function is proposed to replace the classifier in the deep learning model, thereby improving the classification effect. Together, these two components form the proposed image classification algorithm based on a stacked sparse coding deep learning model and optimized kernel function nonnegative sparse representation. The experimental results show that the proposed method not only achieves a higher average accuracy than other mainstream methods but also adapts well to various image databases. Compared with traditional image classification methods, it extracts more image feature information and matches image information more adaptively. Compared with other deep learning methods, it better handles the problems of complex function approximation and poor classifier performance, further improving image classification accuracy.
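
The classifier stage described above can be illustrated with a minimal sketch of kernel nonnegative sparse representation classification. This is not the paper's implementation: the Gaussian kernel, the projected gradient solver, the function names and the parameters `gamma`, `lam` and `n_iter` are all assumptions chosen for illustration, and the feature vectors stand in for the output of the stacked sparse coding network. A test sample is coded as a nonnegative combination of training samples in kernel space and assigned to the class whose coefficients reconstruct it with the smallest residual.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """Gaussian kernel between the rows of A and the rows of B (assumed kernel)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def nonneg_kernel_sparse_code(K, k_y, lam=0.01, n_iter=500):
    """Projected gradient descent for
    min_{a >= 0}  0.5 * a^T K a - a^T k_y + lam * sum(a),
    i.e. an L1-penalised nonnegative reconstruction in kernel space."""
    step = 1.0 / (np.linalg.norm(K, 2) + 1e-12)  # 1 / Lipschitz constant of the gradient
    a = np.zeros_like(k_y)
    for _ in range(n_iter):
        grad = K @ a - k_y + lam
        a = np.maximum(a - step * grad, 0.0)  # project onto the nonnegative orthant
    return a


def classify(X_train, y_train, x, gamma=0.5, lam=0.01):
    """Assign x to the class with the smallest kernel-space reconstruction residual."""
    K = rbf_kernel(X_train, X_train, gamma)
    k_y = rbf_kernel(X_train, x[None, :], gamma)[:, 0]
    a = nonneg_kernel_sparse_code(K, k_y, lam)
    best_label, best_res = None, np.inf
    for c in np.unique(y_train):
        a_c = np.where(y_train == c, a, 0.0)  # keep only class-c coefficients
        # ||phi(x) - Phi a_c||^2 with k(x, x) = 1 for the Gaussian kernel
        res = 1.0 - 2.0 * a_c @ k_y + a_c @ K @ a_c
        if res < best_res:
            best_label, best_res = c, res
    return best_label
```

Calling `classify(X_train, y_train, x)` with an `(n, d)` feature matrix, a length-`n` label array and a length-`d` test vector returns one of the training labels. Projected gradient descent is used here only because it is short and easy to check; any nonnegative least squares or coordinate descent solver could take its place.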

Data Availability

The data used to support the findings of this study are included within the paper.

Acknowledgement

This paper is supported by National Natural Science Foundation of China (No. 61701188).

Author information

Corresponding author

Correspondence to Fengping An.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

An, F. Image classification algorithm based on stacked sparse coding deep learning model-optimized kernel function nonnegative sparse representation. Soft Comput 24, 16967–16981 (2020). https://doi.org/10.1007/s00500-020-04989-3

  • DOI: https://doi.org/10.1007/s00500-020-04989-3
