
Graph-Embedded Multi-Layer Kernel Ridge Regression for One-Class Classification

Published in: Cognitive Computation

Abstract

Humans can detect outliers using only observations of normal samples. Similarly, one-class classification (OCC) trains a classification model on normal samples alone, and the model can then be used for outlier detection. This paper proposes a multi-layer architecture for OCC that stacks graph-embedded kernel ridge regression (KRR)-based autoencoders in a hierarchical fashion. We formulate the autoencoders under the graph-embedding framework to exploit local and global variance criteria. The multiple autoencoder layers project the input features into a new feature space, on which we apply a graph-embedded regression-based one-class classifier. We build the proposed hierarchical OCC architecture in a progressive manner and optimize the parameters of each successive layer via closed-form solutions. The performance of the proposed method is evaluated on 21 balanced and 20 imbalanced datasets, and its effectiveness is demonstrated by experimental comparisons against 11 existing state-of-the-art kernel-based one-class classifiers. The Friedman test is also performed to verify the statistical significance of the obtained results. Using two types of graph embedding, four variants of graph-embedded multi-layer KRR-based one-class classification methods are presented in this paper. All four variants outperform the existing one-class classifiers in terms of the various performance metrics; hence, they can be a viable alternative for a wide range of one-class classification tasks. As a future extension, other autoencoder variants can be applied within the proposed architecture to increase efficiency and performance.
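The layer-wise, closed-form training described above can be sketched in a few lines of NumPy. This is a minimal illustrative sketch, not the authors' exact formulation: the RBF kernel, the regularization parameter `C`, the optional Laplacian term `lam * Lap`, and the constant-target one-class regressor at the end are assumptions standing in for the paper's graph-embedded objectives.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    """RBF kernel matrix between the rows of A and B (assumed kernel choice)."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * d2)

def krr_ae_layer(X, C=10.0, gamma=0.1, Lap=None, lam=0.0):
    """One KRR autoencoder layer trained in closed form.

    Solves beta = (K + (I + lam*Lap)/C)^(-1) X; the Laplacian term is a
    hypothetical stand-in for the paper's graph-embedding regularizer.
    Returns the new representation H = K @ beta and the weights beta.
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    R = np.eye(n) if Lap is None else np.eye(n) + lam * Lap
    beta = np.linalg.solve(K + R / C, X)   # closed form, no iterative training
    return K @ beta, beta

def stacked_occ_scores(X, n_layers=2, C=10.0, gamma=0.1):
    """Stack autoencoder layers, then fit a KRR one-class regressor mapping the
    final representation to a constant target of 1 (a common OCC recipe);
    a larger |1 - prediction| marks a sample as more outlying."""
    H = X
    for _ in range(n_layers):
        H, _ = krr_ae_layer(H, C=C, gamma=gamma)
    K = rbf_kernel(H, H, gamma)
    t = np.ones((X.shape[0], 1))                      # constant target
    w = np.linalg.solve(K + np.eye(X.shape[0]) / C, t)
    return np.abs(1.0 - K @ w).ravel()                # training outlier scores
```

A rejection threshold on these scores would typically be set so that a small fraction of the training samples is rejected, e.g., a high quantile of the training scores.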


Notes

  1. One-class classifiers are also known as data descriptors due to their capability to describe the distribution of the data and the boundaries of the class of interest.

  2. Some researchers [21, 22] followed the name kernel extreme learning machine (KELM) [24], while others followed the name KRR [16, 19] instead. We do not want to enter the debate over naming conventions: since there is no difference in the final solutions of KELM and KRR, we follow the traditional name KRR instead of KELM.

  3. Here, “/” denotes “or”: GMKOC uses GKAE and LMKOC uses LKAE.

  4. Here, OCSVM and SVDD yield their best results on the same dataset, i.e., the Iono(1) dataset.

References

  1. Moya M M, Koch M W, Hostetler L D. One-class classifier networks for target recognition applications. Albuquerque: Technical report, Sandia National Labs.; 1993.


  2. Khan S S, Madden M G. A survey of recent trends in one class classification. Irish conference on Artificial Intelligence and Cognitive Science. Springer; 2009. p. 188–197.

  3. Pimentel M A, Clifton D A, Clifton L, Tarassenko L. A review of novelty detection. Signal Process 2014;99:215–249.


  4. Xu Y, Liu C. A rough margin-based one class support vector machine. Neural Comput Appl 2013;22(6):1077–1084.


  5. Hamidzadeh J, Moradi M. Improved one-class classification using filled function. Appl Intell. 2018:1–17.

  6. Xiao Y, Liu B, Cao L, Wu X, Zhang C, Hao Z, Yang F, Cao J. Multi-sphere support vector data description for outliers detection on multi-distribution data. IEEE International Conference on Data Mining Workshops (ICDMW’09). IEEE; 2009. p. 82–87.

  7. Tax D M J. One-class classification: concept-learning in the absence of counter-examples. ASCI Dissertation Series. 2001;65.

  8. Liu B, Xiao Y, Cao L, Hao Z, Deng F. Svdd-based outlier detection on uncertain data. Knowl Inf Syst 2013;34(3):597–618.


  9. Hu W, Wang S, Chung F-L, Liu Y, Ying W. Privacy preserving and fast decision for novelty detection using support vector data description. Soft Comput 2015;19(5):1171–1186.


  10. O’Reilly C, Gluhak A, Imran M A, Rajasegarar S. Anomaly detection in wireless sensor networks in a non-stationary environment. IEEE Commun Surv Tutorials 2014;16(3):1413–1432.


  11. Tax D M J, Duin R P W. Support vector data description. Mach Learn 2004;54(1):45–66.


  12. Schölkopf B, Williamson R C, Smola A J, Shawe-Taylor J, Platt J C. Support vector method for novelty detection. Advances in Neural Information Processing Systems; 1999. p. 582–588.

  13. Hoffmann H. Kernel PCA for novelty detection. Pattern Recogn 2007;40(3):863–874. Software available at http://www.heikohoffmann.de/kpca.html.


  14. Kriegel H-P, Zimek A, et al. Angle-based outlier detection in high-dimensional data. Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM; 2008. p. 444–452.

  15. Japkowicz N. Concept-learning in the absence of counter-examples: An autoassociation-based approach to classification. Ph.D. Thesis. Rutgers: The State University of New Jersey; 1999.


  16. Gautam C, Tiwari A, Tanveer M. AEKOC+: Kernel ridge regression-based auto-encoder for one-class classification using privileged information. Cognitive Computation. 2020:1–14.

  17. Saunders C, Gammerman A, Vovk V. Ridge regression learning algorithm in dual variables. Proceedings of the Fifteenth International Conference on Machine Learning, ICML ’98. San Francisco: Morgan Kaufmann Publishers Inc.; 1998. p. 515–521.

  18. Wornyo D K, Shen X-J, Dong Y, Wang L, Huang S-C. Co-regularized kernel ensemble regression. World Wide Web. 2018;1–18.

  19. Zhang L, Suganthan P N. Benchmarking ensemble classifiers with novel co-trained kernel ridge regression and random vector functional link ensembles [research frontier]. IEEE Comput Intell Mag 2017;12(4): 61–72.


  20. He J, Ding L, Jiang L, Ma L. Kernel ridge regression classification. Neural Networks (IJCNN), 2014 International Joint Conference on. IEEE; 2014. p. 2263–2267.

  21. Leng Q, Qi H, Miao J, Zhu W, Su G. One-class classification with extreme learning machine. Math Probl Eng. 2014;1–11.

  22. Gautam C, Tiwari A, Leng Q. On the construction of extreme learning machine for online and offline one-class classification-an expanded toolbox. Neurocomputing 2017;261:126–143. Software available at https://github.com/Chandan-IITI/One-Class-Kernel-ELM.


  23. Gautam C, Tiwari A, Suresh S, Ahuja K. Adaptive online learning with regularized kernel for one-class classification. IEEE Transactions on Systems, Man, and Cybernetics: Systems. 2019;1–16.

  24. Huang G-B, Zhou H, Ding X, Zhang R. Extreme learning machine for regression and multiclass classification. IEEE Trans Syst Man Cybern Part B (Cybern) 2011;42(2):513–529.


  25. Iosifidis A, Mygdalis V, Tefas A, Pitas I. One-class classification based on extreme learning and geometric class information. Neural Process Lett. 2016;1–16.

  26. Mygdalis V, Iosifidis A, Tefas A, Pitas I. Exploiting subclass information in one-class support vector machine for video summarization. IEEE International Conference on Acoustics, Speech and Signal Processing. 2015.

  27. Mygdalis V, Iosifidis A, Tefas A, Pitas I. One class classification applied in facial image analysis. IEEE International Conference on Image Processing (ICIP). IEEE; 2016. p. 1644–1648.

  28. Kasun L L C, Zhou H, Huang G-B, Vong C M. Representational learning with extreme learning machine for big data. IEEE Intell Syst 2013;28(6):31–34.


  29. Wong C M, Vong C M, Wong P K, Cao J. Kernel-based multilayer extreme learning machines for representation learning. IEEE Trans Neural Netw Learn Syst 2018;29(3):757–762.


  30. Jose C, Goyal P, Aggrwal P, Varma M. Local deep kernel learning for efficient non-linear svm prediction. International Conference on Machine Learning; 2013. p. 486–494.

  31. Wilson A G, Hu Z, Salakhutdinov R, Xing E P. Deep kernel learning. Artificial Intelligence and Statistics; 2016. p. 370–378.

  32. Yan S, Xu D, Zhang B, Zhang H-J, Yang Q, Lin S. Graph embedding and extensions: A general framework for dimensionality reduction. IEEE Trans Pattern Anal Mach Intell. 2007;29(1).

  33. Fernández-Delgado M, Cernadas E, Barro S, Amorim D. Do we need hundreds of classifiers to solve real world classification problems? J Mach Learn Res 2014;15(1):3133–3181.


  34. Belkin M, Niyogi P. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput 2003;15(6):1373–1396.


  35. Saul L K, Roweis S T. Think globally, fit locally: unsupervised learning of low dimensional manifolds. J Mach Learn Res 2003;4:119–155.


  36. Boyer C, Chambolle A, Castro Y D, Duval V, De Gournay F, Weiss P. On representer theorems and convex regularization. SIAM J Optim 2019;29(2):1260–1281.


  37. Duda R O, Hart P E, Stork D G, et al., Vol. 2. Pattern classification. New York: Wiley; 1973.


  38. Lichman M. 2013. UCI machine learning repository.

  39. Tax D M J, Duin R P W. Support vector domain description. Pattern Recogn Lett 1999;20(11):1191–1199.


  40. Chang C-C, Lin C-J. LIBSVM: A library for support vector machines. ACM Trans Intell Syst Technol 2011;2:27:1–27:27. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.


  41. Tax D M J. 2015. DDtools, the data description toolbox for MATLAB, version 2.1.2.

  42. Iman R L, Davenport J M. Approximations of the critical region of the fbietkan statistic. Commun Stat-Theory Methods 1980;9(6):571–595.



Funding

This research was supported by the Department of Electronics and Information Technology (DeitY, Govt. of India) under the Visvesvaraya PhD Scheme for Electronics & IT.

Author information


Corresponding author

Correspondence to Chandan Gautam.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Gautam, C., Tiwari, A., Mishra, P.K. et al. Graph-Embedded Multi-Layer Kernel Ridge Regression for One-Class Classification. Cogn Comput 13, 552–569 (2021). https://doi.org/10.1007/s12559-020-09804-7

