Deep Ranking for Image Zero-Shot Multi-Label Classification.
IEEE Transactions on Image Processing ( IF 10.8 ) Pub Date : 2020-05-14 , DOI: 10.1109/tip.2020.2991527
Zhong Ji , Biying Cui , Huihui Li , Yu-Gang Jiang , Tao Xiang , Timothy Hospedales , Yanwei Fu

During the past decade, both multi-label learning and zero-shot learning have attracted considerable research attention, and significant progress has been made. Multi-label learning algorithms aim to predict multiple labels for a single instance, while most existing zero-shot learning approaches aim to predict a single testing label for each unseen class by transferring knowledge from auxiliary seen classes to target unseen classes. However, relatively little effort has been devoted to predicting multiple labels in the zero-shot setting, which is nevertheless quite a challenging task. In this work, we investigate and formalize a flexible framework consisting of two components, i.e., visual-semantic embedding and zero-shot multi-label prediction. First, we present a deep regression model to project the visual features into the semantic space, which explicitly exploits the correlations in the intermediate semantic layer of word vectors and makes label prediction possible. Then, we formulate the label prediction problem as a pairwise one and employ Ranking SVM to seek the unique multi-label correlations in the embedding space. Furthermore, we provide a transductive multi-label zero-shot prediction approach that exploits the manifold structure of the testing data. We demonstrate the effectiveness of the proposed approach on three popular multi-label datasets, obtaining state-of-the-art performance in both the conventional and generalized ZSL settings.
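The two-stage idea in the abstract (project visual features into a word-vector space, then rank candidate labels pairwise) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a closed-form linear ridge regression in place of the deep regression model, cosine similarity as the label score, and a plain pairwise hinge loss as a stand-in for the Ranking SVM objective. All function names (`fit_ridge_projection`, `rank_labels`, `pairwise_hinge_loss`) and parameters (`lam`, `margin`) are hypothetical.

```python
import numpy as np


def fit_ridge_projection(X, S, lam=1.0):
    """Fit W so that X @ W approximates S (assumed linear stand-in for the deep regressor).

    X: (n, d) visual features; S: (n, k) semantic targets, e.g. the mean
    word vector of each image's labels (an assumption, not from the paper).
    """
    d = X.shape[1]
    # Closed-form ridge solution: (X^T X + lam * I) W = X^T S
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ S)


def rank_labels(x, W, label_vecs):
    """Rank candidate labels (seen or unseen) for one image.

    Returns label indices sorted by descending score, and the cosine scores
    between the projected image and each label's word vector.
    """
    z = x @ W
    z = z / (np.linalg.norm(z) + 1e-12)
    norms = np.linalg.norm(label_vecs, axis=1, keepdims=True) + 1e-12
    scores = (label_vecs / norms) @ z
    return np.argsort(-scores), scores


def pairwise_hinge_loss(scores, relevant, margin=1.0):
    """Ranking-SVM-style pairwise hinge loss for one image.

    Every relevant label should outscore every irrelevant label by `margin`.
    `relevant` is a boolean mask over the candidate labels.
    """
    pos, neg = scores[relevant], scores[~relevant]
    if pos.size == 0 or neg.size == 0:
        return 0.0  # no (relevant, irrelevant) pairs to compare
    diffs = pos[:, None] - neg[None, :]
    return float(np.maximum(0.0, margin - diffs).mean())
```

In this sketch, unseen labels are handled simply by passing their word vectors to `rank_labels`; no training image needs to carry those labels. The transductive step described in the abstract, which uses the manifold structure of the testing data, is not covered here.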

Updated: 2020-07-03