Few-shot imbalanced classification based on data augmentation

  • Special Issue Paper
  • Multimedia Systems

Abstract

Few-shot imbalanced classification tasks are common in real-world applications because data distributions are unbalanced and rare classes have few samples. Traditional machine learning algorithms perform poorly on imbalanced classification: they tend to ignore the few samples of the minority class in order to achieve good overall accuracy. To address this few-shot problem, this study proposes a novel data augmentation method, called H-SMOTE, that rebalances the original imbalanced data in a stable and reasonable way. Extensive experiments were carried out on 12 open datasets with imbalance ratios ranging from 3.8 to 16.4. Two typical classifiers, SVM and Random Forest, were selected to test the performance and generalization of the proposed H-SMOTE, and the typical oversampling algorithm SMOTE was adopted as the comparison baseline. Averaged over the experiments, H-SMOTE outperforms SMOTE in accuracy (by 2.58%), recall (0.67%), F-measure (2.33%), G-mean (2.58%), and AUC (2.5%). In addition, the distribution of the dataset augmented by H-SMOTE is more uniform and stable. This work thus provides a useful data augmentation method for few-shot imbalanced classification, which can also be generalized to many areas of multimedia systems.
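
To make the terms in the abstract concrete, the following is a minimal NumPy sketch of the baseline only: the imbalance ratio, plain SMOTE interpolation, and the G-mean metric. It does not implement H-SMOTE, whose details are not given in the abstract. The function names, the toy data, and the definitions used (majority count over minority count for the imbalance ratio; the square root of sensitivity times specificity for G-mean) are assumptions chosen for illustration, not taken from the paper.

```python
import numpy as np


def imbalance_ratio(y):
    """Majority-class count divided by minority-class count (assumed definition)."""
    counts = np.bincount(y)
    counts = counts[counts > 0]
    return counts.max() / counts.min()


def smote(X_min, n_new, k=5, seed=0):
    """Plain SMOTE baseline (not H-SMOTE): each synthetic point lies on the
    segment between a minority sample and one of its k nearest minority
    neighbours."""
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = np.random.default_rng(seed)
    k = min(k, n - 1)
    # Pairwise squared Euclidean distances among minority samples.
    d = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)  # a point is never its own neighbour
    neighbours = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)  # seed sample for each new point
    pick = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[pick] - X_min[base])


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    pos = y_true == positive
    sensitivity = np.mean(y_pred[pos] == positive)
    specificity = np.mean(y_pred[~pos] != positive)
    return float(np.sqrt(sensitivity * specificity))


# Toy data (hypothetical): 80 majority and 10 minority samples in 2-D.
rng = np.random.default_rng(42)
X_maj = rng.normal(0.0, 1.0, size=(80, 2))
X_min = rng.normal(3.0, 0.5, size=(10, 2))
y = np.array([0] * 80 + [1] * 10)

ratio = imbalance_ratio(y)        # 80 / 10
X_new = smote(X_min, n_new=70)    # 70 synthetic minority samples
X_bal = np.vstack([X_maj, X_min, X_new])
y_bal = np.concatenate([y, np.ones(len(X_new), dtype=int)])
print(ratio, imbalance_ratio(y_bal))
```

Because every synthetic point is a convex combination of two existing minority samples, plain SMOTE never leaves the region already spanned by the minority class. G-mean is useful alongside accuracy because it drops to zero when a classifier ignores the minority class entirely, which accuracy alone can hide.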

Acknowledgements

This work was supported by the Major Science and Technology Program of Xinjiang Production and Construction Corps (grant number 2021AA006) and the Natural Science Program of Shihezi University (grant number KX01230101).

Author information

Corresponding author

Correspondence to Lixin Zhang.

About this article

Cite this article

Chao, X., Zhang, L. Few-shot imbalanced classification based on data augmentation. Multimedia Systems 29, 2843–2851 (2023). https://doi.org/10.1007/s00530-021-00827-0

  • DOI: https://doi.org/10.1007/s00530-021-00827-0
