A new method for positive and unlabeled learning with privileged information
Applied Intelligence ( IF 3.4 ) Pub Date : 2021-06-12 , DOI: 10.1007/s10489-021-02528-7
Bo Liu , Qian Liu , Yanshan Xiao

Positive and unlabeled learning (PU learning) addresses the setting in which only positive and unlabeled examples are available. Most previous work focuses on identifying negative examples in the unlabeled data so that supervised learning approaches can be applied to build a classifier. The remaining unlabeled data, however, are either excluded from the learning phase or forced into one of the classes, which limits the performance of PU learning. In addition, previous PU methods assume that the training data and the test data share the same feature representation. In practice, it is often possible to collect features that the training data have but the test data do not; such features are called privileged information. In this paper, we propose a new similarity-based method for positive and unlabeled learning with privileged information (SPUPIL), which consists of two steps. SPUPIL first applies the KNN method to generate similarity weights, and then incorporates the similarity weights and the privileged information into a learning model based on Ranking SVM to build a more accurate classifier. We use the Lagrangian method to transform the original model into its dual problem and solve the dual to obtain the classifier. Extensive experiments on real data sets show that SPUPIL outperforms state-of-the-art PU learning methods.
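
The abstract does not give the formula for the similarity weights, so the following is only a minimal sketch of what the first step could look like, under a loudly stated assumption: each unlabeled example is weighted by the fraction of its k nearest training neighbours that are labeled positive. The function name `knn_similarity_weights`, the parameter `k`, and the weighting rule itself are hypothetical and are not taken from the paper. The second step (the Ranking SVM model with privileged information and its dual) is not sketched here.

```python
import numpy as np


def knn_similarity_weights(X_pos, X_unl, k=2):
    """Illustrative KNN weighting (an assumption, not the paper's formula).

    Returns, for every unlabeled example, the fraction of its k nearest
    neighbours (searched among all other training examples) that are
    labeled positive. Values lie in [0, 1].
    """
    X_pos = np.asarray(X_pos, dtype=float)
    X_unl = np.asarray(X_unl, dtype=float)
    X_all = np.vstack([X_pos, X_unl])

    # Boolean mask marking which rows of X_all are labeled positive.
    is_pos = np.concatenate([
        np.ones(len(X_pos), dtype=bool),
        np.zeros(len(X_unl), dtype=bool),
    ])

    # Squared Euclidean distance from each unlabeled point to every point.
    diff = X_unl[:, None, :] - X_all[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)

    # An example must not count as its own neighbour.
    rows = np.arange(len(X_unl))
    d2[rows, len(X_pos) + rows] = np.inf

    # Indices of the k nearest neighbours of each unlabeled point.
    nearest = np.argsort(d2, axis=1)[:, :k]
    return is_pos[nearest].mean(axis=1)


# Toy data: one unlabeled point sits among the positives, three are far away.
X_pos = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
X_unl = [[0.1, 0.1], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]
weights = knn_similarity_weights(X_pos, X_unl, k=2)
print(weights)  # [1. 0. 0. 0.]
```

In this toy example the unlabeled point near the positive cluster gets weight 1 and the distant points get weight 0; soft weights of this kind let every unlabeled example contribute to training instead of being discarded or forced into one class.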




Updated: 2021-06-13