Comparison Knowledge Translation for Generalizable Image Classification
arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2022-05-07, DOI: arxiv-2205.03633
Zunlei Feng, Tian Qiu, Sai Wu, Xiaotuan Jin, Zengliang He, Mingli Song, Huiqiong Wang

Deep learning has recently achieved remarkable performance in image classification tasks, a success that depends heavily on massive annotation. However, the classification mechanism of existing deep learning models seems to contrast with the human recognition mechanism. With only a glance at an image of an object, even one of an unknown type, humans can quickly and precisely find other objects of the same category among massive images, an ability that benefits from the daily recognition of various objects. In this paper, we attempt to build a generalizable framework that emulates the human recognition mechanism in the image classification task, hoping to improve classification performance on unseen categories with the support of annotations from other categories. Specifically, we investigate a new task termed Comparison Knowledge Translation (CKT). Given a set of fully labeled categories, CKT aims to translate the comparison knowledge learned from the labeled categories to a set of novel categories. To this end, we put forward a Comparison Classification Translation Network (CCT-Net), which comprises a comparison classifier and a matching discriminator. The comparison classifier is devised to classify whether two images belong to the same category, while the matching discriminator works with it in an adversarial manner to check whether the classified results match the ground truth. Exhaustive experiments show that CCT-Net achieves surprising generalization ability on unseen categories and state-of-the-art (SOTA) performance on target categories.
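The two-branch design described above lends itself to a simple pairwise training loop. Below is a minimal PyTorch sketch of that setup: a comparison classifier scoring whether two images share a category, and a matching discriminator trained adversarially to tell true labels from the classifier's scores. All module names, layer sizes, and the exact adversarial coupling are illustrative assumptions; the paper's actual CCT-Net architecture and losses are not reproduced here.

```python
import torch
import torch.nn as nn

class ComparisonClassifier(nn.Module):
    """Scores whether two images belong to the same category (assumed small CNN)."""
    def __init__(self):
        super().__init__()
        # Shared encoder embeds each image of the pair.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Comparison head maps the concatenated embeddings to P(same category).
        self.head = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x1, x2):
        z = torch.cat([self.encoder(x1), self.encoder(x2)], dim=1)
        return torch.sigmoid(self.head(z))

class MatchingDiscriminator(nn.Module):
    """Judges whether a (score, label) pair looks like a correct classification."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, score, label):
        return torch.sigmoid(self.net(torch.cat([score, label], dim=1)))

def train_step(clf, disc, opt_c, opt_d, x1, x2, y):
    """One adversarial step; y is 1.0 where x1 and x2 share a category."""
    bce = nn.BCELoss()
    score = clf(x1, x2)

    # Discriminator: true labels paired with themselves are "real",
    # classifier scores paired with the labels are "fake".
    d_real, d_fake = disc(y, y), disc(score.detach(), y)
    loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Classifier: supervised pair loss plus an adversarial term that
    # pushes its scores toward outputs the discriminator accepts as real.
    d_fake = disc(score, y)
    loss_c = bce(score, y) + bce(d_fake, torch.ones_like(d_fake))
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()

# Usage on random data (hypothetical shapes):
clf, disc = ComparisonClassifier(), MatchingDiscriminator()
opt_c = torch.optim.Adam(clf.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
x1, x2 = torch.randn(8, 3, 64, 64), torch.randn(8, 3, 64, 64)
y = torch.randint(0, 2, (8, 1)).float()
train_step(clf, disc, opt_c, opt_d, x1, x2, y)
```

Because such a classifier learns only a same/different relation over pairs rather than fixed category logits, it can in principle be applied to unseen categories at test time by comparing a query image against labeled exemplars, which is the generalization behavior the abstract claims for CCT-Net.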

Updated: 2022-05-10