Robust metric learning based on the rescaled hinge loss

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

Distance/similarity learning is a fundamental problem in machine learning. For example, the kNN classifier and many clustering methods rely on a distance/similarity measure. Metric learning algorithms improve the performance of these methods by learning an optimal distance function from data. Most metric learning methods require training information in the form of pair or triplet sets. Nowadays, this information is often obtained from the Internet via crowdsourcing, so it may contain label noise or outliers, which degrades the learned metric; the learned metric can even perform worse than general-purpose metrics such as the Euclidean distance. To address this challenge, this paper presents a new robust metric learning method based on the rescaled hinge loss. This loss function generalizes the popular hinge loss and was originally introduced in Xu et al. (Pattern Recogn 63:139–148, 2017) to develop a robust SVM algorithm. In this paper, we formulate the metric learning problem using the rescaled hinge loss and develop an efficient algorithm based on half-quadratic (HQ) optimization to solve it. Experimental results on a variety of real and synthetic datasets confirm that our robust algorithm considerably outperforms state-of-the-art metric learning methods in the presence of label noise and outliers.
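
To make the idea concrete, the following is a minimal sketch of the HQ alternation described above, assuming a triplet-based formulation with a Mahalanobis metric, a Frobenius-norm regularizer, and a projected-subgradient inner solver. These choices, as well as the function and parameter names (`rescaled_hinge_metric_learning`, `psd_project`, `eta`, `lam`), are illustrative assumptions and not the authors' implementation.

```python
import numpy as np


def psd_project(M):
    """Project a symmetric matrix onto the PSD cone by clipping negative eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh((M + M.T) / 2)
    return (eigvecs * np.clip(eigvals, 0.0, None)) @ eigvecs.T


def rescaled_hinge_metric_learning(X, triplets, eta=1.0, lam=0.1,
                                   hq_iters=10, grad_iters=50, lr=0.01):
    """Sketch of HQ-style metric learning with the rescaled hinge loss.

    X        : (n, d) data matrix.
    triplets : iterable of (i, j, k); x_i should be closer to x_j than to x_k.
    eta      : rescaling parameter of the rescaled hinge loss.
    lam      : Frobenius regularization weight (an illustrative choice).
    """
    d = X.shape[1]
    M = np.eye(d)  # start from the Euclidean metric

    def hinge(i, j, k):
        # Standard triplet hinge loss under the current metric M.
        dij, dik = X[i] - X[j], X[i] - X[k]
        return max(0.0, 1.0 + dij @ M @ dij - dik @ M @ dik)

    for _ in range(hq_iters):
        # HQ step 1: with M fixed, compute a weight for each triplet.
        # Strongly violated (likely noisy) triplets receive exponentially small weight.
        weights = np.array([np.exp(-eta * hinge(i, j, k)) for i, j, k in triplets])

        # HQ step 2: with weights fixed, minimize the weighted hinge loss over M
        # by projected subgradient descent on the PSD cone.
        for _ in range(grad_iters):
            grad = lam * M
            for (i, j, k), w in zip(triplets, weights):
                if hinge(i, j, k) > 0:
                    dij, dik = X[i] - X[j], X[i] - X[k]
                    grad += w * (np.outer(dij, dij) - np.outer(dik, dik))
            M = psd_project(M - lr * grad)

    return M
```

Because a triplet's weight decays exponentially with its hinge loss, mislabeled or outlying constraints contribute little to the inner optimization, which is the intuition behind the robustness to label noise claimed in the abstract.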

Notes

  1. Generalized Maximum Entropy Model for learning from Noisy side information.

  2. Robust Neighborhood Component Analysis.

  3. Bayesian Large Margin Nearest Neighbor.

  4. Sparse Bayesian Metric Learning.

  5. Robust DML.

  6. Downloaded from https://www.vlfeat.org/matconvnet/pretrained/.

  7. https://parnec.nuaa.edu.cn/xtan/data/BLMNN_demo.zip.

  8. https://parnec.nuaa.edu.cn/xtan/data/BLMNN_demo.zip.

  9. https://www.cs.cmu.edu/~deyum/Publications.htm.

  10. https://mloss.org/software/view/553/.

  11. https://tec.citius.usc.es/stac/.

References

  1. Bak S, Carr P (2017) One-shot metric learning for person re-identification. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2990–2999

  2. Bellet A, Habrard A, Sebban M (2014) A survey on metric learning for feature vectors and structured data. Technical report

  3. Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, Cambridge

  4. Chechik G, Sharma V, Shalit U, Bengio S (2010) Large scale online learning of image similarity through ranking. J Mach Learn Res 11:1109–1135

  5. Crammer K, Dekel O, Keshet J, Shalev-Shwartz S, Singer Y (2006) Online passive-aggressive algorithms. J Mach Learn Res 7:551–585

  6. Davis JV, Kulis B, Jain P, Sra S, Dhillon IS (2007) Information-theoretic metric learning. In: Proceedings of the 24th international conference on machine learning, Corvallis, Oregon, USA

  7. Dunn OJ (1961) Multiple comparisons among means. J Am Stat Assoc 56:52–64

  8. Friedman M (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J Am Stat Assoc 32:675–701

  9. Geman D, Yang C (1995) Nonlinear image recovery with half-quadratic regularization. IEEE Trans Image Process 4:932–946

  10. Goldberger J, Hinton GE, Roweis ST, Salakhutdinov RR (2005) Neighbourhood components analysis. In: Advances in neural information processing systems, pp 513–520

  11. Guillaumin M, Mensink T, Verbeek J, Schmid C (2009) Tagprop: discriminative metric learning in nearest neighbor models for image auto-annotation. In: 12th International conference on computer vision, 2009 IEEE, pp 309–316

  12. Hao X, Hoi SCH, Rong J, Peilin Z (2014) Online multiple kernel similarity learning for visual search. IEEE Trans Pattern Anal Mach Intell 36:536–549. https://doi.org/10.1109/TPAMI.2013.149

  13. Huang K, Jin R, Xu Z, Liu C-L (2012) Robust metric learning by smooth optimization. arXiv preprint arXiv:1203.3461

  14. Hull JJ (1994) A database for handwritten text recognition research. IEEE Trans Pattern Anal Mach Intell 16:550–554. https://doi.org/10.1109/34.291440

  15. Jain P, Kulis B, Davis JV, Dhillon IS (2012) Metric and kernel learning using a linear transformation. J Mach Learn Res 13:519–547

  16. Jiang N, Liu W, Wu Y (2012) Order determination and sparsity-regularized metric learning for adaptive visual tracking. In: IEEE conference on computer vision and pattern recognition (CVPR), 2012. IEEE, pp 1956–1963

  17. Jin R, Wang S, Zhou Y (2009) Regularized distance metric learning: theory and algorithm. In: Advances in neural information processing systems. pp 862–870

  18. Krishna RA, Hata K, Chen S, Kravitz J, Shamma DA, Fei-Fei L, Bernstein MS (2016) Embracing error to enable rapid crowdsourcing. In: Proceedings of the 2016 CHI conference on human factors in computing systems. ACM, pp 3167–3179

  19. Kulis B (2013) Metric learning: a survey. Foundations and Trends in Machine Learning 5:287–364

  20. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86:2278–2324

  21. Lee K-C, Ho J, Kriegman DJ (2005) Acquiring linear subspaces for face recognition under variable lighting. IEEE Trans Pattern Anal Mach Intell 27:684–698

  22. Li F-F, Fergus R, Perona P (2004) Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories. In: Conference on computer vision and pattern recognition workshop, 2004, p 178. https://doi.org/10.1109/CVPR.2004.109

  23. Li J, Xu C, Yang W, Sun C, Tao D (2017) Discriminative multi-view interactive image re-ranking. IEEE Trans Image Process 26:3113–3127

  24. Lichman M (2013) UCI machine learning repository

  25. Lin L, Wang G, Zuo W, Feng X, Zhang L (2017) Cross-domain visual matching via generalized similarity measure and feature learning. IEEE Trans Pattern Anal Mach Intell 39:1089–1102

  26. Nguyen B, Morell C, Baets BD (2017) Supervised distance metric learning through maximization of the Jeffrey divergence. Pattern Recogn 64:215–225. https://doi.org/10.1016/j.patcog.2016.11.010

  27. Niu G, Dai B, Yamada M, Sugiyama M (2014) Information-theoretic semi-supervised metric learning via entropy regularization. Neural Comput 26:1717–1762. https://doi.org/10.1162/NECO_a_00614

  28. Rodríguez-Fdez I, Canosa A, Mucientes M, Bugarín A (2015) STAC: a web platform for the comparison of algorithms using statistical tests. In: 2015 IEEE international conference on fuzzy systems (FUZZ-IEEE), 2015. IEEE, pp 1–8

  29. Shapiro A, Wardi Y (1996) Convergence analysis of gradient descent stochastic algorithms. J Optim Theory Appl 91:439–454

  30. Shen C, Kim J, Wang L, Hengel AVD (2012) Positive semidefinite metric learning using boosting-like algorithms. J Mach Learn Res 13:1007–1036

  31. Shi Y, Bellet A, Sha F (2014) Sparse compositional metric learning. In: AAAI. pp 2078–2084

  32. Wang D, Tan X (2014) Robust distance metric learning in the presence of label noise. In: AAAI, 2014. pp 1321–1327

  33. Wang D, Tan X (2018) Robust distance metric learning via Bayesian inference. IEEE Trans Image Process 27:1542–1553

  34. Wang F, Zuo W, Zhang L, Meng D, Zhang D (2015) A kernel classification framework for metric learning. IEEE Trans Neural Netw Learn Syst 26:1950–1962

  35. Wang H, Nie F, Huang H (2014) Robust distance metric learning via simultaneous L1-norm minimization and maximization. In: Jebara T, Xing EP (eds) Proceedings of the 31st international conference on machine learning (ICML-14). JMLR Workshop and Conference Proceedings, pp 1836–1844

  36. Weinberger KQ, Saul LK (2009) Distance metric learning for large margin nearest neighbor classification. J Mach Learn Res 10:207–244

  37. Wu P, Hoi SCH, Zhao P, Miao C, Liu ZY (2016) Online multi-modal distance metric learning with application to image retrieval. IEEE Trans Knowl Data Eng 28:454–467. https://doi.org/10.1109/TKDE.2015.2477296

  38. Xiang S, Nie F, Zhang C (2008) Learning a Mahalanobis distance metric for data clustering and classification. Pattern Recogn 41:3600–3612. https://doi.org/10.1016/j.patcog.2008.05.018

  39. Xu G, Cao Z, Hu B-G, Principe JC (2017) Robust support vector machines based on the rescaled hinge loss function. Pattern Recogn 63:139–148

  40. Yang T, Jin R, Jain AK (2010) Learning from noisy side information by generalized maximum entropy model. In: Proceedings of the 27th international conference on machine learning (ICML-10). Citeseer, pp 1199–1206

  41. Yuan T, Deng W, Tang J, Tang Y, Chen B (2019) Signal-to-noise ratio: a robust distance metric for deep metric learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4815–4824

  42. Zabihzadeh D, Monsefi R, Yazdi HS (2019) Sparse Bayesian approach for metric learning in latent space. Knowl-Based Syst 178:11–24

  43. Zha Z-J, Mei T, Wang M, Wang Z, Hua X-S (2009) Robust distance metric learning with auxiliary knowledge. In: Twenty-first international joint conference on artificial intelligence

Acknowledgements

We would like to acknowledge the Machine Learning Lab in the Engineering Faculty of FUM for their kind technical support.

Author information

Corresponding author

Correspondence to Davood Zabihzadeh.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

See Table 4.

Table 4 Summary of the main notations and abbreviations used throughout the paper

Cite this article

Al-Obaidi, S.A.R., Zabihzadeh, D. & Hajiabadi, H. Robust metric learning based on the rescaled hinge loss. Int. J. Mach. Learn. & Cyber. 11, 2515–2528 (2020). https://doi.org/10.1007/s13042-020-01137-z
