Metric Learning from Imbalanced Data with Generalization Guarantees
Pattern Recognition Letters (IF 3.9) Pub Date: 2020-03-05, DOI: 10.1016/j.patrec.2020.03.008
Leo Gautheron , Amaury Habrard , Emilie Morvant , Marc Sebban

Since many machine learning algorithms require a distance metric to capture dis/similarities between data points, metric learning has received much attention during the past decade. Surprisingly, very few methods have focused on learning a metric in an imbalanced scenario, where the number of positive examples is much smaller than the number of negatives, and even fewer provide theoretical guarantees in this setting. Here, we address this difficult task and design a new Mahalanobis metric learning algorithm (IML) which deals with class imbalance. We further prove a generalization bound involving the proportion of positive examples, using the uniform stability framework. The empirical study performed on a wide range of datasets shows the efficiency of IML.
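To make the setting concrete, the sketch below illustrates pairwise Mahalanobis metric learning with class-imbalance reweighting: a PSD matrix M induces the distance d_M(x, x') = sqrt((x - x')^T M (x - x')), and pairs touching the minority (positive) class are upweighted by the inverse class proportion. This is a minimal toy illustration of the general idea, NOT the authors' IML algorithm; the function names, learning rate, and margin are hypothetical choices.

```python
import numpy as np

def mahalanobis(x, y, M):
    """Mahalanobis distance induced by a PSD matrix M."""
    d = x - y
    return float(np.sqrt(d @ M @ d))

def learn_metric(X, y, n_iter=100, lr=0.01, margin=1.0, rng=None):
    """Toy pairwise metric learning with class-imbalance reweighting.

    Pairs involving the minority (positive, label 1) class are upweighted
    by the inverse class proportion, loosely in the spirit of learning a
    metric from imbalanced data. Illustrative only, not the IML algorithm.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    n, d = X.shape
    M = np.eye(d)                      # start from the Euclidean metric
    pos_frac = float(y.mean())         # proportion of positive examples
    for _ in range(n_iter):
        i, j = rng.integers(0, n, size=2)
        if i == j:
            continue
        diff = np.outer(X[i] - X[j], X[i] - X[j])
        dist2 = float((X[i] - X[j]) @ M @ (X[i] - X[j]))
        # upweight pairs that involve the minority class
        w = 1.0 / pos_frac if (y[i] == 1 or y[j] == 1) else 1.0 / (1.0 - pos_frac)
        if y[i] == y[j]:
            M -= lr * w * diff         # similar pair: shrink their distance
        elif dist2 < margin:
            M += lr * w * diff         # dissimilar pair inside margin: push apart
        # project M back onto the PSD cone so d_M stays a pseudo-metric
        eigval, eigvec = np.linalg.eigh(M)
        M = eigvec @ np.diag(np.clip(eigval, 0.0, None)) @ eigvec.T
    return M
```

The PSD projection after each update is what keeps the learned bilinear form a valid (pseudo-)metric; the reweighting term is one simple way to keep the scarce positive pairs from being drowned out by negative pairs during sampling.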




Updated: 2020-03-07