Image classification algorithm based on stacked sparse coding deep learning model-optimized kernel function nonnegative sparse representation

An, Fengping

doi:10.1007/s00500-020-04989-3

Methodologies and Application
Published: 01 May 2020

Soft Computing, volume 24, pages 16967–16981 (2020)

Abstract

Image classification is an important means of extracting information from images and is widely used across engineering fields. Traditional image classification methods have been applied to many practical problems, but they suffer from unsatisfactory results, low classification accuracy and weak adaptive ability. These shortcomings arise because such methods rely on the designer's prior knowledge and understanding of the classification task, and because they treat image feature extraction and classification as two separate steps. Deep learning models, by contrast, have a powerful learning ability and integrate feature extraction and classification into a single process, which can effectively improve image classification accuracy. Image classification based on deep learning nevertheless has two problems of its own. First, the complex functions in the deep learning model cannot be approximated effectively. Second, the classifier that comes with the deep learning model has low accuracy. To address the first problem, this paper introduces the idea of sparse representation into the architecture of the deep learning network. It combines the strong linear decomposition ability of sparse representation for multidimensional data with the structural advantages of deep multi-layer nonlinear mapping to approximate the complex functions in the deep learning model. The result is a deep learning model with adaptive approximation ability, which solves the function approximation problem of deep learning models. To address the second problem, a sparse representation classification method based on an optimized kernel function is proposed to replace the classifier in the deep learning model, thereby improving the classification result. On this basis, this paper proposes an image classification algorithm based on a stacked sparse coding deep learning model and optimized kernel function nonnegative sparse representation. The experimental results show that the proposed method not only achieves a higher average accuracy than other mainstream methods, but also adapts well to various image databases. Compared with traditional image classification methods, it extracts more image feature information and matches the image information more adaptively. Compared with other deep learning methods, it better addresses complex function approximation and poor classifier performance, thus further improving image classification accuracy.
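The classification stage described in the abstract can be illustrated with a minimal sketch of a kernel nonnegative sparse representation classifier. The abstract does not specify the kernel, the solver or any parameter values, so everything below is an assumption: a Gaussian (RBF) kernel stands in for the optimized kernel function, projected gradient descent stands in for the solver, and the names `rbf_kernel`, `nonneg_kernel_sparse_code`, `classify`, `gamma` and `lam` are hypothetical. In the paper the inputs would be features learned by the stacked sparse coding network; here they are plain vectors.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel between the rows of A and the rows of B.

    Assumption: this kernel is only a stand-in for the paper's
    optimized kernel function, which the abstract does not define.
    """
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def nonneg_kernel_sparse_code(K, k_y, lam=0.01, n_iter=500):
    """Minimise 0.5 * x^T K x - k_y^T x + lam * sum(x) subject to x >= 0.

    Up to a constant this is 0.5 * ||phi(y) - Phi x||^2 + lam * ||x||_1
    in the kernel feature space, restricted to nonnegative coefficients.
    Solved by projected gradient descent (an assumed solver).
    """
    # Step size 1 / L, where L is the largest eigenvalue of K.
    step = 1.0 / (np.linalg.eigvalsh(K)[-1] + 1e-12)
    x = np.zeros(K.shape[0])
    for _ in range(n_iter):
        grad = K @ x - k_y + lam
        x = np.maximum(x - step * grad, 0.0)  # project onto x >= 0
    return x


def classify(train_X, train_labels, y, gamma=0.5, lam=0.01):
    """Assign y to the class whose training samples reconstruct it best."""
    K = rbf_kernel(train_X, train_X, gamma)
    k_y = rbf_kernel(train_X, y[None, :], gamma)[:, 0]
    x = nonneg_kernel_sparse_code(K, k_y, lam)

    best_label, best_residual = None, np.inf
    for c in np.unique(train_labels):
        # Keep only the coefficients that belong to class c.
        xc = np.where(train_labels == c, x, 0.0)
        # Squared feature-space residual; k(y, y) = 1 for the RBF kernel.
        residual = 1.0 - 2.0 * xc @ k_y + xc @ K @ xc
        if residual < best_residual:
            best_label, best_residual = int(c), residual
    return best_label, x
```

Calling `classify(train_X, train_labels, y)` returns the predicted label together with the coefficient vector, so the sparsity and nonnegativity of the code can be inspected directly. The class-wise residual rule follows the usual sparse representation classification scheme and is not claimed to match the paper's exact decision rule.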

Data Availability

The data used to support the findings of this study are included within the paper.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (No. 61701188).

Author information

Authors and Affiliations

School of Physics and Electronic Electrical Engineering, Huaiyin Normal University, Huaian, 223300, China
Fengping An

Corresponding author

Correspondence to Fengping An.

Ethics declarations

Conflict of interest

The author declares no conflict of interest.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

An, F. Image classification algorithm based on stacked sparse coding deep learning model-optimized kernel function nonnegative sparse representation. Soft Comput 24, 16967–16981 (2020). https://doi.org/10.1007/s00500-020-04989-3

Published: 01 May 2020
Issue Date: November 2020
DOI: https://doi.org/10.1007/s00500-020-04989-3
