Abstract
Distance/similarity learning is a fundamental problem in machine learning. For example, the kNN classifier and clustering methods rely on a distance/similarity measure. Metric learning algorithms improve the performance of these methods by learning an optimal distance function from data. Most metric learning methods need training information in the form of pair or triplet sets. Nowadays, this training information is often obtained from the Internet via crowdsourcing methods. Therefore, it may contain label noise or outliers, which degrade the learned metric. The learned metric may even perform worse than general metrics such as the Euclidean distance. To address this challenge, this paper presents a new robust metric learning method based on the Rescaled Hinge loss. This loss function generalizes the popular Hinge loss and was initially introduced in Xu et al. (Pattern Recogn 63:139–148, 2017) to develop a new robust SVM algorithm. In this paper, we formulate the metric learning problem using the Rescaled Hinge loss function and then develop an efficient algorithm based on HQ (Half-Quadratic) optimization to solve the problem. Experimental results on a variety of both real and synthetic datasets confirm that our new robust algorithm considerably outperforms state-of-the-art metric learning methods in the presence of label noise and outliers.
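The abstract does not state the loss explicitly. As a rough illustration only, the sketch below assumes the form given in Xu et al. (2017): the hinge loss max(0, 1 - z) is passed through a bounded exponential rescaling controlled by a parameter, here called eta. The function names, the default eta and the margin variable z are illustrative assumptions, not taken from this paper.

```python
import math


def hinge(z):
    """Standard hinge loss for a margin value z."""
    return max(0.0, 1.0 - z)


def rescaled_hinge(z, eta=0.5):
    """Rescaled hinge loss, assuming the form in Xu et al. (2017):
    beta * (1 - exp(-eta * hinge(z))), with beta = 1 / (1 - exp(-eta)).
    The default eta is an arbitrary illustrative value."""
    if eta <= 0:
        raise ValueError("eta must be positive")
    beta = 1.0 / (1.0 - math.exp(-eta))
    return beta * (1.0 - math.exp(-eta * hinge(z)))


if __name__ == "__main__":
    # A badly violated margin (e.g. a mislabeled pair) grows without bound
    # under the hinge loss but saturates under the rescaled version.
    for z in (1.5, 0.0, -5.0, -50.0):
        print(z, hinge(z), rescaled_hinge(z))
```

Under this assumed form the loss is bounded above by beta, so a single grossly mislabeled pair or triplet cannot dominate the objective, and as eta approaches zero the loss approaches the ordinary hinge loss, which is the sense in which it is a generalization.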
Notes
Generalized Maximum Entropy Model for learning from Noisy side information.
Robust Neighborhood Component Analysis.
Bayesian Large Margin Nearest Neighbor.
Sparse Bayesian Metric Learning.
Robust DML.
Downloaded from https://www.vlfeat.org/matconvnet/pretrained/.
References
Bak S, Carr P (2017) One-shot metric learning for person re-identification. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2990–2999
Bellet A, Habrard A, Sebban M (2014) A survey on metric learning for feature vectors and structured data. Technical report
Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, Cambridge
Chechik G, Sharma V, Shalit U, Bengio S (2010) Large scale online learning of image similarity through ranking. J Mach Learn Res 11:1109–1135
Crammer K, Dekel O, Keshet J, Shalev-Shwartz S, Singer Y (2006) Online passive-aggressive algorithms. J Mach Learn Res 7:551–585
Davis JV, Kulis B, Jain P, Sra S, Dhillon IS (2007) Information-theoretic metric learning. In: Proceedings of the 24th international conference on machine learning, Corvallis, Oregon, USA
Dunn OJ (1961) Multiple comparisons among means. J Am Stat Assoc 56:52–64
Friedman M (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J Am Stat Assoc 32:675–701
Geman D, Yang C (1995) Nonlinear image recovery with half-quadratic regularization. IEEE Trans Image Process 4:932–946
Goldberger J, Hinton GE, Roweis ST, Salakhutdinov RR (2005) Neighbourhood components analysis. In: Advances in neural information processing systems, pp 513–520
Guillaumin M, Mensink T, Verbeek J, Schmid C (2009) Tagprop: discriminative metric learning in nearest neighbor models for image auto-annotation. In: 2009 IEEE 12th international conference on computer vision, pp 309–316
Hao X, Hoi SCH, Rong J, Peilin Z (2014) Online multiple kernel similarity learning for visual search. IEEE Trans Pattern Anal Mach Intell 36:536–549. https://doi.org/10.1109/TPAMI.2013.149
Huang K, Jin R, Xu Z, Liu C-L (2012) Robust metric learning by smooth optimization. arXiv preprint arXiv:1203.3461
Hull JJ (1994) A database for handwritten text recognition research. IEEE Trans Pattern Anal Mach Intell 16:550–554. https://doi.org/10.1109/34.291440
Jain P, Kulis B, Davis JV, Dhillon IS (2012) Metric and kernel learning using a linear transformation. J Mach Learn Res 13:519–547
Jiang N, Liu W, Wu Y (2012) Order determination and sparsity-regularized metric learning adaptive visual tracking. In: IEEE conference on computer vision and pattern recognition (CVPR), 2012. IEEE, pp 1956–1963
Jin R, Wang S, Zhou Y (2009) Regularized distance metric learning: theory and algorithm. In: Advances in neural information processing systems, pp 862–870
Krishna RA, Hata K, Chen S, Kravitz J, Shamma DA, Fei-Fei L, Bernstein MS (2016) Embracing error to enable rapid crowdsourcing. In: Proceedings of the 2016 CHI conference on human factors in computing systems. ACM, pp 3167–3179
Kulis B (2013) Metric learning: a survey. Found Trends Mach Learn 5:287–364
LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86:2278–2324
Lee K-C, Ho J, Kriegman DJ (2005) Acquiring linear subspaces for face recognition under variable lighting. IEEE Trans Pattern Anal Mach Intell 27:684–698
Li F-F, Fergus R, Perona P (2004) Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories. In: Conference on computer vision and pattern recognition workshop, 2004, pp 178–178. https://doi.org/10.1109/CVPR.2004.109
Li J, Xu C, Yang W, Sun C, Tao D (2017) Discriminative multi-view interactive image re-ranking. IEEE Trans Image Process 26:3113–3127
Lichman M (2013) UCI machine learning repository
Lin L, Wang G, Zuo W, Feng X, Zhang L (2017) Cross-domain visual matching via generalized similarity measure and feature learning. IEEE Trans Pattern Anal Mach Intell 39:1089–1102
Nguyen B, Morell C, Baets BD (2017) Supervised distance metric learning through maximization of the Jeffrey divergence. Pattern Recogn 64:215–225. https://doi.org/10.1016/j.patcog.2016.11.010
Niu G, Dai B, Yamada M, Sugiyama M (2014) Information-theoretic semi-supervised metric learning via entropy regularization. Neural Comput 26:1717–1762. https://doi.org/10.1162/NECO_a_00614
Rodríguez-Fdez I, Canosa A, Mucientes M, Bugarín A (2015) STAC: a web platform for the comparison of algorithms using statistical tests. In: 2015 IEEE international conference on fuzzy systems (FUZZ-IEEE), 2015. IEEE, pp 1–8
Shapiro A, Wardi Y (1996) Convergence analysis of gradient descent stochastic algorithms. J Optim Theory Appl 91:439–454
Shen C, Kim J, Wang L, Hengel AVD (2012) Positive semidefinite metric learning using boosting-like algorithms. J Mach Learn Res 13:1007–1036
Shi Y, Bellet A, Sha F (2014) Sparse compositional metric learning. In: AAAI. pp 2078–2084
Wang D, Tan X (2014) Robust distance metric learning in the presence of label noise. In: AAAI, 2014. pp 1321–1327
Wang D, Tan X (2018) Robust distance metric learning via Bayesian inference. IEEE Trans Image Process 27:1542–1553
Wang F, Zuo W, Zhang L, Meng D, Zhang D (2015) A kernel classification framework for metric learning. IEEE Trans Neural Netw Learn Syst 26:1950–1962
Wang H, Nie F, Huang H (2014) Robust distance metric learning via simultaneous L1-norm minimization and maximization. In: Jebara T, Xing EP (eds) Proceedings of the 31st international conference on machine learning (ICML-14), 2014. JMLR Workshop and Conference Proceedings, pp 1836–1844
Weinberger KQ, Saul LK (2009) Distance metric learning for large margin nearest neighbor classification. J Mach Learn Res 10:207–244
Wu P, Hoi SCH, Zhao P, Miao C, Liu ZY (2016) Online multi-modal distance metric learning with application to image retrieval. IEEE Trans Knowl Data Eng 28:454–467. https://doi.org/10.1109/TKDE.2015.2477296
Xiang S, Nie F, Zhang C (2008) Learning a Mahalanobis distance metric for data clustering and classification. Pattern Recogn 41:3600–3612. https://doi.org/10.1016/j.patcog.2008.05.018
Xu G, Cao Z, Hu B-G, Principe JC (2017) Robust support vector machines based on the rescaled hinge loss function. Pattern Recogn 63:139–148
Yang T, Jin R, Jain AK (2010) Learning from noisy side information by generalized maximum entropy model. In: Proceedings of the 27th international conference on machine learning (ICML-10). Citeseer, pp 1199–1206
Yuan T, Deng W, Tang J, Tang Y, Chen B (2019) Signal-to-noise ratio: a robust distance metric for deep metric learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4815–4824
Zabihzadeh D, Monsefi R, Yazdi HS (2019) Sparse Bayesian approach for metric learning in latent space. Knowl-Based Syst 178:11–24
Zha Z-J, Mei T, Wang M, Wang Z, Hua X-S (2009) Robust distance metric learning with auxiliary knowledge. In: Twenty-first international joint conference on artificial intelligence
Acknowledgements
We would like to thank the Machine Learning Lab in the Engineering Faculty of FUM for their kind technical support.
Cite this article
Al-Obaidi, S.A.R., Zabihzadeh, D. & Hajiabadi, H. Robust metric learning based on the rescaled hinge loss. Int. J. Mach. Learn. & Cyber. 11, 2515–2528 (2020). https://doi.org/10.1007/s13042-020-01137-z
DOI: https://doi.org/10.1007/s13042-020-01137-z