
Graph-Embedded Multi-Layer Kernel Ridge Regression for One-Class Classification

Published in: Cognitive Computation

Abstract

Humans can detect outliers using only observations of normal samples. Similarly, one-class classification (OCC) trains a classification model on normal samples alone, and the model can then be used for outlier detection. This paper proposes a multi-layer architecture for OCC that stacks graph-embedded kernel ridge regression (KRR)-based autoencoders in a hierarchical fashion. We formulate the autoencoders under the graph-embedding framework to exploit local and global variance criteria. The multiple autoencoder layers project the input features into a new feature space, on which we apply a graph-embedded regression-based one-class classifier. We build the proposed hierarchical OCC architecture in a progressive manner and optimize the parameters of each successive layer via closed-form solutions. The performance of the proposed method is evaluated on 21 balanced and 20 imbalanced datasets, and its effectiveness is demonstrated by experimental comparisons against 11 existing state-of-the-art kernel-based one-class classifiers. The Friedman test is also performed to verify the statistical significance of the obtained results. Using two types of graph embedding, four variants of graph-embedded multi-layer KRR-based one-class classification methods are presented in this paper. All four variants outperform the existing one-class classifiers in terms of the various performance metrics; hence, they can be a viable alternative for a wide range of one-class classification tasks. As a future extension, other autoencoder variants can be applied within the proposed architecture to increase efficiency and performance.
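The layer-wise, closed-form training described above can be sketched in a few lines of NumPy. This is a minimal illustrative sketch, not the authors' exact formulation: the RBF kernel, the regularization parameter `C`, the optional Laplacian term `lam * Lap`, and the constant-target one-class regressor at the end are assumptions standing in for the paper's graph-embedded objectives.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    """RBF kernel matrix between the rows of A and B (assumed kernel choice)."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * d2)

def krr_ae_layer(X, C=10.0, gamma=0.1, Lap=None, lam=0.0):
    """One KRR autoencoder layer trained in closed form.

    Solves beta = (K + (I + lam*Lap)/C)^(-1) X; the Laplacian term is a
    hypothetical stand-in for the paper's graph-embedding regularizer.
    Returns the new representation H = K @ beta and the weights beta.
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    R = np.eye(n) if Lap is None else np.eye(n) + lam * Lap
    beta = np.linalg.solve(K + R / C, X)   # closed form, no iterative training
    return K @ beta, beta

def stacked_occ_scores(X, n_layers=2, C=10.0, gamma=0.1):
    """Stack autoencoder layers, then fit a KRR one-class regressor mapping the
    final representation to a constant target of 1 (a common OCC recipe);
    a larger |1 - prediction| marks a sample as more outlying."""
    H = X
    for _ in range(n_layers):
        H, _ = krr_ae_layer(H, C=C, gamma=gamma)
    K = rbf_kernel(H, H, gamma)
    t = np.ones((X.shape[0], 1))                      # constant target
    w = np.linalg.solve(K + np.eye(X.shape[0]) / C, t)
    return np.abs(1.0 - K @ w).ravel()                # training outlier scores
```

A rejection threshold on these scores would typically be set so that a small fraction of the training samples is rejected, e.g., a high quantile of the training scores.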


Notes

  1. One-class classifiers are also known as data descriptors due to their capability to describe the distribution of the data and the boundaries of the class of interest.

  2. Some researchers [21, 22] followed the name kernel extreme learning machine (KELM) [24], while others followed the name KRR [16, 19] instead. We do not want to enter the debate over naming conventions: since there is no difference in the final solutions of KELM and KRR, we follow the traditional name KRR instead of KELM.

  3. Here, “/” denotes “or”: GMKOC uses GKAE and LMKOC uses LKAE.

  4. Here, OCSVM and SVDD yield their best results on the same dataset, i.e., the Iono(1) dataset.

References

  1. Moya M M, Koch M W, Hostetler L D. One-class classifier networks for target recognition applications. Albuquerque: Technical report, Sandia National Labs.; 1993.


  2. Khan S S, Madden M G. A survey of recent trends in one class classification. Irish conference on Artificial Intelligence and Cognitive Science. Springer; 2009. p. 188–197.

  3. Pimentel M A, Clifton D A, Clifton L, Tarassenko L. A review of novelty detection. Signal Process 2014;99:215–249.


  4. Xu Y, Liu C. A rough margin-based one class support vector machine. Neural Comput Appl 2013;22(6):1077–1084.


  5. Hamidzadeh J, Moradi M. Improved one-class classification using filled function. Appl Intell. 2018:1–17.

  6. Xiao Y, Liu B, Cao L, Wu X, Zhang C, Hao Z, Yang F, Cao J. Multi-sphere support vector data description for outliers detection on multi-distribution data. IEEE International Conference on Data Mining Workshops (ICDMW’09). IEEE; 2009. p. 82–87.

  7. Tax D M J. One-class classification: concept-learning in the absence of counter-examples. ASCI Dissertation Series. 2001;65.

  8. Liu B, Xiao Y, Cao L, Hao Z, Deng F. Svdd-based outlier detection on uncertain data. Knowl Inf Syst 2013;34(3):597–618.


  9. Hu W, Wang S, Chung F-L, Liu Y, Ying W. Privacy preserving and fast decision for novelty detection using support vector data description. Soft Comput 2015;19(5):1171–1186.


  10. O’Reilly C, Gluhak A, Imran M A, Rajasegarar S. Anomaly detection in wireless sensor networks in a non-stationary environment. IEEE Commun Surv Tutorials 2014;16(3):1413–1432.


  11. Tax D M J, Duin R P W. Support vector data description. Mach Learn 2004;54(1):45–66.


  12. Schölkopf B, Williamson R C, Smola A J, Shawe-Taylor J, Platt J C. Support vector method for novelty detection. Advances in Neural Information Processing Systems; 1999. p. 582–588.

  13. Hoffmann H. Kernel PCA for novelty detection. Pattern Recogn 2007;40(3):863–874. Software available at http://www.heikohoffmann.de/kpca.html.


  14. Kriegel H-P, Zimek A, et al. Angle-based outlier detection in high-dimensional data. Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM; 2008. p. 444–452.

  15. Japkowicz N. Concept-learning in the absence of counter-examples: An autoassociation-based approach to classification. Ph.D. Thesis. Rutgers: The State University of New Jersey; 1999.


  16. Gautam C, Tiwari A, Tanveer M. AEKOC+: Kernel ridge regression-based auto-encoder for one-class classification using privileged information. Cognitive Computation. 2020:1–14.

  17. Saunders C, Gammerman A, Vovk V. Ridge regression learning algorithm in dual variables. Proceedings of the Fifteenth International Conference on Machine Learning, ICML ’98. San Francisco: Morgan Kaufmann Publishers Inc.; 1998. p. 515–521.

  18. Wornyo D K, Shen X-J, Dong Y, Wang L, Huang S-C. Co-regularized kernel ensemble regression. World Wide Web. 2018;1–18.

  19. Zhang L, Suganthan P N. Benchmarking ensemble classifiers with novel co-trained kernel ridge regression and random vector functional link ensembles [research frontier]. IEEE Comput Intell Mag 2017;12(4): 61–72.


  20. He J, Ding L, Jiang L, Ma L. Kernel ridge regression classification. Neural Networks (IJCNN), 2014 International Joint Conference on. IEEE; 2014. p. 2263–2267.

  21. Leng Q, Qi H, Miao J, Zhu W, Su G. One-class classification with extreme learning machine. Math Probl Eng. 2014;1–11.

  22. Gautam C, Tiwari A, Leng Q. On the construction of extreme learning machine for online and offline one-class classification-an expanded toolbox. Neurocomputing 2017;261:126–143. Software available at https://github.com/Chandan-IITI/One-Class-Kernel-ELM.


  23. Gautam C, Tiwari A, Suresh S, Ahuja K. Adaptive online learning with regularized kernel for one-class classification. IEEE Transactions on Systems, Man, and Cybernetics: Systems. 2019;1–16.

  24. Huang G-B, Zhou H, Ding X, Zhang R. Extreme learning machine for regression and multiclass classification. IEEE Trans Syst Man Cybern Part B (Cybern) 2011;42(2):513–529.


  25. Iosifidis A, Mygdalis V, Tefas A, Pitas I. One-class classification based on extreme learning and geometric class information. Neural Process Lett. 2016;1–16.

  26. Mygdalis V, Iosifidis A, Tefas A, Pitas I. Exploiting subclass information in one-class support vector machine for video summarization. IEEE International Conference on Acoustics, Speech and Signal Processing. 2015.

  27. Mygdalis V, Iosifidis A, Tefas A, Pitas I. One class classification applied in facial image analysis. IEEE International Conference on Image Processing (ICIP). IEEE; 2016. p. 1644–1648.

  28. Kasun L L C, Zhou H, Huang G-B, Vong C M. Representational learning with extreme learning machine for big data. IEEE Intell Syst 2013;28(6):31–34.


  29. Wong C M, Vong C M, Wong P K, Cao J. Kernel-based multilayer extreme learning machines for representation learning. IEEE Trans Neural Netw Learn Syst 2018;29(3):757–762.


  30. Jose C, Goyal P, Aggrwal P, Varma M. Local deep kernel learning for efficient non-linear svm prediction. International Conference on Machine Learning; 2013. p. 486–494.

  31. Wilson A G, Hu Z, Salakhutdinov R, Xing E P. Deep kernel learning. Artificial Intelligence and Statistics; 2016. p. 370–378.

  32. Yan S, Xu D, Zhang B, Zhang H-J, Yang Q, Lin S. Graph embedding and extensions: A general framework for dimensionality reduction. IEEE Trans Pattern Anal Mach Intell. 2007;29(1).

  33. Fernández-Delgado M, Cernadas E, Barro S, Amorim D. Do we need hundreds of classifiers to solve real world classification problems? J Mach Learn Res 2014;15(1):3133–3181.


  34. Belkin M, Niyogi P. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput 2003;15(6):1373–1396.


  35. Saul L K, Roweis S T. Think globally, fit locally: unsupervised learning of low dimensional manifolds. J Mach Learn Res 2003;4:119–155.


  36. Boyer C, Chambolle A, Castro Y D, Duval V, De Gournay F, Weiss P. On representer theorems and convex regularization. SIAM J Optim 2019;29(2):1260–1281.


  37. Duda R O, Hart P E, Stork D G, et al., Vol. 2. Pattern classification. New York: Wiley; 1973.


  38. Lichman M. 2013. UCI machine learning repository.

  39. Tax D M J, Duin R P W. Support vector domain description. Pattern Recogn Lett 1999;20(11):1191–1199.


  40. Chang C-C, Lin C-J. LIBSVM: A library for support vector machines. ACM Trans Intell Syst Technol 2011;2:27:1–27:27. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.


  41. Tax D M J. 2015. DDtools, the data description toolbox for MATLAB, version 2.1.2.

  42. Iman R L, Davenport J M. Approximations of the critical region of the fbietkan statistic. Commun Stat-Theory Methods 1980;9(6):571–595.



Funding

This research was supported by the Department of Electronics and Information Technology (DeitY, Govt. of India) under the Visvesvaraya PhD Scheme for Electronics & IT.

Author information


Corresponding author

Correspondence to Chandan Gautam.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Gautam, C., Tiwari, A., Mishra, P.K. et al. Graph-Embedded Multi-Layer Kernel Ridge Regression for One-Class Classification. Cogn Comput 13, 552–569 (2021). https://doi.org/10.1007/s12559-020-09804-7

