Sketch discriminatively regularized online gradient descent classification

Applied Intelligence

Abstract

Online learning represents an important family of efficient and scalable algorithms for large-scale classification problems. Many of these algorithms are linear and therefore fast, but they are more likely to have low accuracy on complex classification tasks. The kernel trick can be applied to improve accuracy, but it often brings a high computational cost. Moreover, discriminative information, which is vital in classification, is still not fully utilized in these algorithms. In this paper, we propose a novel online linear method called Sketch Discriminatively Regularized Online Gradient Descent Classification (SDROGD). To exploit inter-class separability and intra-class compactness, SDROGD uses a matrix to characterize the discriminative information and embeds it directly into a new regularization term. This matrix can be updated online by the sketch technique. After applying a simple but effective optimization, we show that SDROGD has a good time complexity bound, which is linear in the feature dimension or the number of samples. Experimental results on both toy and real-world datasets demonstrate that SDROGD not only runs faster but also achieves much better classification accuracy than some related kernelized algorithms.
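
The abstract does not state the SDROGD objective or its update rule, so the following Python code is only a rough illustration of the general idea and not the authors' algorithm. It assumes a Frequent Directions style sketch for the online matrix update, a hinge loss, binary labels in {-1, +1}, no bias term, and a regularizer that models only intra-class compactness (inter-class separability is left out). The class names FrequentDirections and SketchRegularizedOGD and all hyperparameters are hypothetical.

```python
import numpy as np


class FrequentDirections:
    """Keeps a small matrix B whose Gram matrix B.T @ B approximates
    the sum of outer products x x^T of all appended rows."""

    def __init__(self, dim, ell):
        self.ell = ell
        self.B = np.zeros((2 * ell, dim))  # buffer of 2 * ell rows
        self.next_row = 0

    def append(self, x):
        if self.next_row == self.B.shape[0]:
            self._shrink()
        self.B[self.next_row] = x
        self.next_row += 1

    def _shrink(self):
        # Subtract the (ell + 1)-th squared singular value from all of them,
        # which zeroes out every direction beyond the top ell.
        _, s, vt = np.linalg.svd(self.B, full_matrices=False)
        k = len(s)
        delta = s[self.ell] ** 2 if k > self.ell else 0.0
        shrunk = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
        self.B[:] = 0.0
        self.B[:k] = shrunk[:, None] * vt
        self.next_row = min(self.ell, k)


class SketchRegularizedOGD:
    """Illustrative online linear classifier (assumed form, not from the paper):
    hinge loss + (lam / 2) ||w||^2 + (eta / 2) w^T S w, where S is a sketched
    estimate of the within-class covariance."""

    def __init__(self, dim, ell=8, lam=1e-2, eta=1e-2, lr=0.1):
        self.w = np.zeros(dim)
        self.sketch = FrequentDirections(dim, ell)
        self.means = {1: np.zeros(dim), -1: np.zeros(dim)}
        self.counts = {1: 0, -1: 0}
        self.lam, self.eta, self.lr = lam, eta, lr
        self.t = 0

    def predict(self, x):
        return 1 if self.w @ x >= 0 else -1

    def partial_fit(self, x, y):
        self.t += 1
        # Running class mean, then feed the centered sample to the sketch
        # (a crude streaming stand-in for the within-class scatter).
        self.counts[y] += 1
        self.means[y] += (x - self.means[y]) / self.counts[y]
        self.sketch.append(x - self.means[y])

        # Regularizer gradient through the sketch costs O(ell * dim).
        B = self.sketch.B
        grad = self.lam * self.w + self.eta * (B.T @ (B @ self.w)) / self.t
        if y * (self.w @ x) < 1:  # hinge loss subgradient
            grad -= y * x
        self.w -= self.lr / np.sqrt(self.t) * grad
```

Each call to partial_fit touches only the sketch of at most 2 * ell rows, so the per-sample cost in this illustration grows linearly with the feature dimension, which is the kind of bound the abstract describes. Dividing the sketched scatter by the sample count is a design choice made here to keep the step size stable on long streams.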

Notes

  1. https://www.cs.toronto.edu/~kriz/cifar.html

  2. http://yann.lecun.com/exdb/mnist/

  3. https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/

  4. http://archive.ics.uci.edu/ml/datasets.html

Acknowledgments

This work was supported by the National Key R&D Program of China (Grant No. 2017YFB1002801), the National Natural Science Foundation of China (Grant No. 61876091), and the Collaborative Innovation Center of Wireless Communications Technology.

Author information

Corresponding author

Correspondence to Hui Xue.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xue, H., Ren, Z. Sketch discriminatively regularized online gradient descent classification. Appl Intell 50, 1367–1378 (2020). https://doi.org/10.1007/s10489-019-01590-6

  • DOI: https://doi.org/10.1007/s10489-019-01590-6
