Pseudo support vector domain description to train large-size and continuously growing datasets

El Boujnouni, Mohamed

doi:10.1007/s10115-021-01606-z

Regular Paper
Published: 28 August 2021

Volume 63, pages 2671–2692, (2021)
Knowledge and Information Systems

Mohamed El Boujnouni ORCID: orcid.org/0000-0002-8348-3153¹

Abstract

Support vector domain description (SVDD) is a data description method inspired by the support vector machine (SVM). It describes a set of data points with a sphere of minimal volume that encloses the majority of them, and the boundary of this sphere is used to classify new samples. SVDD has been applied successfully to many challenging classification problems and has shown good generalization capability. However, it still has some major weaknesses, and this paper focuses on two of them. The first is the large amount of memory and computation time that SVDD requires in the training step. This problem is most pronounced on large datasets and can hinder or prevent the use of SVDD. This paper presents an approximate solution that makes it possible to apply SVDD to large-scale datasets. The new version is based on a divide-and-conquer strategy and proceeds in two steps: it first divides the whole dataset into random subsets, each of which can be described efficiently with a small sphere using SVDD; it then applies a new algorithm that finds the smallest sphere enclosing the minimal spheres built in the first step. The second weakness of standard SVDD is its static learning process: the classifier must be retrained on the whole dataset each time new training samples become available. This paper proposes a dynamic approach that trains SVDD only on the new samples and combines the resulting minimal sphere with the previous one(s) to construct the smallest sphere that encloses all the samples. Like SVDD, the proposed approach can be extended to non-linear classification by using kernel functions. Experimental results on artificial and real datasets validate the performance of the approach.
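The two-step idea in the abstract can be illustrated with a rough Python sketch. It is not the paper's algorithm, which is not given on this page. It rests on several assumptions: each subset is described by an approximate minimum enclosing ball (a Badoiu-Clarkson style iteration) standing in for a linear SVDD solver with no slack; spheres are combined pairwise with the closed-form smallest sphere enclosing two spheres, which yields an enclosing sphere but not necessarily the smallest one for more than two; and all function names (`fit_sphere`, `merge_spheres`, `describe_in_chunks`) are hypothetical.

```python
import numpy as np

def fit_sphere(points, iters=200):
    """Approximate minimum enclosing ball of a point set.

    Stand-in for training SVDD on one subset (assumption, not the paper's solver).
    """
    c = points[0].astype(float).copy()
    for k in range(1, iters + 1):
        # Move the center a shrinking step toward the current farthest point.
        far = points[np.argmax(np.linalg.norm(points - c, axis=1))]
        c += (far - c) / (k + 1)
    # Taking the max distance as radius guarantees every point is enclosed.
    r = float(np.max(np.linalg.norm(points - c, axis=1)))
    return c, r

def merge_spheres(c1, r1, c2, r2):
    """Smallest sphere enclosing two given spheres (closed form)."""
    d = float(np.linalg.norm(c2 - c1))
    if d + r2 <= r1:          # sphere 2 lies inside sphere 1
        return c1, r1
    if d + r1 <= r2:          # sphere 1 lies inside sphere 2
        return c2, r2
    r = (d + r1 + r2) / 2.0
    c = c1 + (c2 - c1) * ((r - r1) / d)
    return c, r

def describe_in_chunks(X, n_chunks, seed=0):
    """Split X into random subsets, fit a sphere per subset, fold them together."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    c, r = None, None
    for part in np.array_split(idx, n_chunks):
        ci, ri = fit_sphere(X[part])
        c, r = (ci, ri) if c is None else merge_spheres(c, r, ci, ri)
    return c, r

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(600, 3))
    c, r = describe_in_chunks(X, n_chunks=6)
    # Dynamic update: fit only the new samples, then merge with the old sphere.
    X_new = rng.normal(loc=2.0, size=(50, 3))
    c, r = merge_spheres(c, r, *fit_sphere(X_new))
    print("center:", c, "radius:", r)
```

The same `merge_spheres` call serves both the batch case and the growing-dataset case, which is why the sketch retrains nothing when `X_new` arrives. A kernelized variant would need centers expressed as weighted sums of mapped samples and is left out here.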

Author information

Authors and Affiliations

Laboratory of Information Technologies, National School of Applied Sciences, Chouaib Doukkali University, El Jadida, Morocco
Mohamed El Boujnouni

Corresponding author

Correspondence to Mohamed El Boujnouni.

About this article

Cite this article

El Boujnouni, M. Pseudo support vector domain description to train large-size and continuously growing datasets. Knowl Inf Syst 63, 2671–2692 (2021). https://doi.org/10.1007/s10115-021-01606-z

Received: 06 May 2020
Revised: 05 August 2021
Accepted: 09 August 2021
Published: 28 August 2021
Issue Date: October 2021
DOI: https://doi.org/10.1007/s10115-021-01606-z
