Partial label metric learning by collapsing classes
International Journal of Machine Learning and Cybernetics (IF 3.1), Pub Date: 2020-05-06, DOI: 10.1007/s13042-020-01129-z
Shuang Xu, Min Yang, Yu Zhou, Ruirui Zheng, Wenpeng Liu, Jianjun He

Partial label learning (PLL) is a recently proposed weakly supervised learning framework in which the ground-truth label of a training sample is not precisely annotated but concealed in a set of candidate labels, so the accuracy of existing PLL algorithms is usually lower than that of traditional supervised learning algorithms. Since the accuracy of a learning algorithm is closely related to its distance metric, metric learning techniques can be employed to improve existing PLL algorithms. However, only a few PLL metric learning algorithms have been proposed to date. In view of this, this paper proposes a novel PLL metric learning algorithm based on the collapsing classes model. The basic idea is to first treat each training sample and a neighbor with which it shares candidate labels as a similar pair, and each training sample and a neighbor with which it shares no candidate labels as a dissimilar pair; two probability distributions are then defined based on the distances and the label similarities of these pairs, respectively; finally, the metric matrix is obtained by minimizing the Kullback–Leibler divergence between these two distributions. Experimental results on six UCI data sets and four real-world PLL data sets show that the proposed algorithm clearly improves the accuracy of existing PLL algorithms.
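To make the pair-based construction concrete, the sketch below shows one plausible reading of the procedure described in the abstract; it is not the authors' implementation. It assumes a Mahalanobis metric d_M(x_i, x_j) = (x_i - x_j)^T M (x_i - x_j), a target distribution that is uniform over the k nearest neighbours sharing at least one candidate label (and zero over neighbours sharing none), a model distribution given by a softmax over negative distances, and plain gradient descent with a projection onto the positive semi-definite cone. The function name learn_partial_label_metric and the hyper-parameters k, lr, and n_iter are illustrative assumptions.

```python
import numpy as np

def learn_partial_label_metric(X, candidate_labels, k=10, lr=1e-3, n_iter=100):
    """X: (n, d) feature matrix; candidate_labels: list of n sets of candidate labels."""
    n, d = X.shape
    M = np.eye(d)  # start from the Euclidean metric

    # Target distribution p0: uniform over the k nearest neighbours that share at
    # least one candidate label with x_i, zero over neighbours that share none.
    eucl = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(eucl, np.inf)
    neighbors = np.argsort(eucl, axis=1)[:, :k]
    p0 = np.zeros((n, n))
    for i in range(n):
        similar = [j for j in neighbors[i] if candidate_labels[i] & candidate_labels[j]]
        if similar:
            p0[i, similar] = 1.0 / len(similar)
    active = p0.sum(axis=1) > 0  # samples that have at least one similar neighbour

    for _ in range(n_iter):
        # Model distribution pM: softmax over negative Mahalanobis distances under M.
        diff = X[:, None, :] - X[None, :, :]               # (n, n, d) pairwise differences
        dist = np.einsum('ijd,de,ije->ij', diff, M, diff)  # d_M(x_i, x_j)
        np.fill_diagonal(dist, np.inf)
        logits = -dist - (-dist).max(axis=1, keepdims=True)
        pM = np.exp(logits)
        pM /= pM.sum(axis=1, keepdims=True)

        # Gradient of sum_i KL(p0(.|i) || pM(.|i)) with respect to M.
        W = (p0 - pM) * active[:, None]
        grad = np.einsum('ij,ijd,ije->de', W, diff, diff)
        M -= lr * grad

        # Project M back onto the positive semi-definite cone so it stays a valid metric.
        eigval, eigvec = np.linalg.eigh(M)
        M = (eigvec * np.clip(eigval, 0, None)) @ eigvec.T
    return M
```

Under these assumptions, the learned matrix M could then be plugged into any distance-based PLL algorithm, for example by replacing Euclidean distances with d_M when selecting neighbours, which is how the abstract suggests the metric improves existing methods. The dense pairwise computation is quadratic in the number of samples and is intended only as an illustration of the objective, not as an efficient implementation.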




Updated: 2020-05-06