Classification from Pairwise Similarities/Dissimilarities and Unlabeled Data via Empirical Risk Minimization
Neural Computation (IF 2.9), Pub Date: 2021-02-23, DOI: 10.1162/neco_a_01373
Takuya Shimada 1 , Han Bao 1 , Issei Sato 1 , Masashi Sugiyama 2

Pairwise similarities and dissimilarities between data points are often obtained more easily than full labels of data in real-world classification problems. To make use of such pairwise information, an empirical risk minimization approach has been proposed, where an unbiased estimator of the classification risk is computed from only pairwise similarities and unlabeled data. However, this approach has not yet been able to handle pairwise dissimilarities. Semisupervised clustering methods can incorporate both similarities and dissimilarities into their framework; however, they typically require strong geometrical assumptions on the data distribution, such as the manifold assumption, which may cause severe performance deterioration. In this letter, we derive an unbiased estimator of the classification risk based on similarities, dissimilarities, and unlabeled data together. We theoretically establish an estimation error bound and experimentally demonstrate the practical usefulness of our empirical risk minimization method.




Updated: 2021-02-23