Abstract
As a high-performance approach to various image processing tasks, deep convolutional neural networks (CNNs) have achieved impressive results and attracted considerable attention in recent years. However, object classification on small datasets, for which only a limited number of training images is available, is still considered an open problem. In this paper, we investigate a new CNN-based method that effectively extracts semantic image features and thereby boosts object classification performance on small datasets. Our main goal is to increase classification accuracy by first detecting and then extracting the main object of each image. Because training a CNN from scratch on a small dataset does not yield high performance, since millions of parameters must be learned, we adopt a transfer learning strategy. Consequently, we first determine the main object of an image and then extract it; the extracted main object is used to tune the weights of the CNN during training. In this study, we employ a CNN pretrained on the ImageNet dataset to obtain mid-level image representations. Our experiments on the Caltech-101 object dataset show that the proposed method substantially outperforms other state-of-the-art methods.
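The abstract does not specify which segmentation algorithm is used to locate the main object, so as a minimal illustration of the detect-then-extract step only, here is a sketch that segments a grayscale image by thresholding, keeps the largest connected foreground component, and crops its bounding box. The function name, threshold value, and 4-connectivity choice are hypothetical assumptions for this sketch, not details from the paper.

```python
import numpy as np
from collections import deque

def extract_main_object(img, threshold=0.5):
    """Threshold a grayscale image, keep the largest 4-connected
    foreground component, and return its bounding-box crop."""
    mask = img > threshold
    labels = np.zeros(img.shape, dtype=int)
    best_label, best_size, next_label = 0, 0, 1
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            if mask[i, j] and labels[i, j] == 0:
                # BFS flood fill over 4-connected neighbours
                size = 0
                q = deque([(i, j)])
                labels[i, j] = next_label
                while q:
                    r, c = q.popleft()
                    size += 1
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < img.shape[0] and 0 <= cc < img.shape[1]
                                and mask[rr, cc] and labels[rr, cc] == 0):
                            labels[rr, cc] = next_label
                            q.append((rr, cc))
                if size > best_size:
                    best_size, best_label = size, next_label
                next_label += 1
    if best_size == 0:          # no foreground found: return image unchanged
        return img
    rows, cols = np.where(labels == best_label)
    return img[rows.min():rows.max() + 1, cols.min():cols.max() + 1]
```

In the pipeline described above, a crop like this would then be fed to the ImageNet-pretrained CNN for fine-tuning, rather than the full image with its background clutter.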
ACKNOWLEDGMENTS
The author is grateful to the anonymous reviewers for their insightful comments and constructive suggestions.
Ethics declarations
The author declares that there is no conflict of interest.
Cite this article
Giveki, D., Improving the performance of convolutional neural networks for image classification, Opt. Mem. Neural Networks, 2021, vol. 30, pp. 51–66. https://doi.org/10.3103/S1060992X21010100