
Pseudo support vector domain description to train large-size and continuously growing datasets

  • Regular Paper
Knowledge and Information Systems

Abstract

Support vector domain description (SVDD) is a data description method inspired by the support vector machine (SVM). It describes a set of data points with a minimum-volume sphere that encloses the majority of them, and the boundary of this sphere is used to classify new samples. SVDD has been applied successfully to many challenging classification problems and has shown good generalization capability. However, it still has major weaknesses, and this paper addresses two of them. The first is the large amount of memory and computational time that SVDD requires during training. This problem is most severe with large datasets and can hinder or even prevent the use of SVDD. This paper presents an approximate solution that makes SVDD applicable to large-scale datasets. It follows a divide-and-conquer strategy and proceeds in two steps. First, it divides the large dataset into random subsets, each of which can be described efficiently by a small sphere using SVDD. Then, it applies a new algorithm that finds the smallest sphere enclosing the minimal spheres built in the first step. The second weakness of standard SVDD is its static learning process: the classifier must be retrained on the whole dataset each time new training samples become available. This paper proposes a dynamic approach that trains SVDD on the new samples only and combines the resulting minimal sphere with the previous one(s) to construct the smallest sphere that encloses all the samples. Like SVDD, the proposed approach can be extended to non-linear classification by using kernel functions. Experimental results on artificial and real datasets validate the performance of the approach.
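
The two-step scheme and the incremental update can be illustrated with a minimal Python sketch. It is not the paper's algorithm, and several pieces are assumptions made for illustration: `describe_subset` replaces a real SVDD solver with an approximate minimum enclosing ball (a farthest-point iteration in the style of Badoiu and Clarkson, roughly what hard-margin SVDD with a linear kernel would aim for), and `merge_spheres` uses the closed-form smallest sphere enclosing two spheres. Folding that pairwise merge over several spheres always yields an enclosing sphere, but not necessarily the smallest one, which is the problem the paper's own algorithm targets. All function names are hypothetical, and the sketch works in input space only (no kernel).

```python
import math
import random


def distance(p, q):
    """Euclidean distance between two points given as sequences of floats."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def describe_subset(points, iterations=200):
    """Approximate minimum enclosing ball of a point set.

    Assumption: stands in for SVDD on one subset (no slack, no kernel).
    The center moves toward the farthest point with a shrinking step size.
    """
    center = list(points[0])
    for i in range(1, iterations + 1):
        farthest = max(points, key=lambda p: distance(center, p))
        center = [c + (f - c) / (i + 1) for c, f in zip(center, farthest)]
    # Taking the maximum distance guarantees every point is enclosed.
    radius = max(distance(center, p) for p in points)
    return center, radius


def merge_spheres(s1, s2):
    """Smallest sphere enclosing two spheres, each given as (center, radius)."""
    (c1, r1), (c2, r2) = s1, s2
    d = distance(c1, c2)
    if d + r2 <= r1:  # s2 lies inside s1
        return s1
    if d + r1 <= r2:  # s1 lies inside s2
        return s2
    radius = (d + r1 + r2) / 2.0
    t = (radius - r1) / d  # d > 0 here, otherwise one branch above returned
    center = [a + t * (b - a) for a, b in zip(c1, c2)]
    return center, radius


def train_in_chunks(points, n_subsets, seed=0):
    """Step 1: describe random subsets. Step 2: fold them into one sphere."""
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    subsets = [shuffled[i::n_subsets] for i in range(n_subsets)]
    spheres = [describe_subset(s) for s in subsets if s]
    result = spheres[0]
    for sphere in spheres[1:]:
        result = merge_spheres(result, sphere)
    return result


def update(sphere, new_points):
    """Incremental step: describe only the new samples, then merge."""
    return merge_spheres(sphere, describe_subset(new_points))


def contains(sphere, x, tol=1e-9):
    """Classify x as inside (True) or outside (False) the sphere."""
    center, radius = sphere
    return distance(center, x) <= radius + tol
```

With this sketch, `train_in_chunks(points, 4)` returns a single `(center, radius)` pair that encloses every training point, and `update(sphere, new_points)` extends it without revisiting the earlier samples.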





Author information

Corresponding author

Correspondence to Mohamed El Boujnouni.


About this article

Cite this article

El Boujnouni, M. Pseudo support vector domain description to train large-size and continuously growing datasets. Knowl Inf Syst 63, 2671–2692 (2021). https://doi.org/10.1007/s10115-021-01606-z

  • DOI: https://doi.org/10.1007/s10115-021-01606-z
