Learning label correlations for multi-label image recognition with graph networks
Pattern Recognition Letters (IF 3.9), Pub Date: 2020-07-31, DOI: 10.1016/j.patrec.2020.07.040
Qing Li, Xiaojiang Peng, Yu Qiao, Qiang Peng

Multi-label image recognition is the task of predicting the set of object labels present in an image. Because objects co-occur in the physical world, it is desirable to model label dependencies. Existing methods resort to either recurrent networks or pre-defined label correlation graphs for this purpose. In this paper, instead of using a pre-defined graph, which is inflexible and may be sub-optimal for multi-label classification, we propose the A-GCN, which leverages the popular Graph Convolutional Networks with an Adaptive label correlation graph to model label dependencies. Specifically, we introduce a plug-and-play Label Graph (LG) module that learns label correlations from word embeddings, and then use a traditional GCN to map this graph into label-dependent object classifiers, which are in turn applied to image features. The basic LG module incorporates two 1 × 1 convolutional layers and uses the dot product to generate label graphs. In addition, we propose a sparse correlation constraint to enhance the LG module, and we also explore different LG architectures. We validate our method on two diverse multi-label datasets: MS-COCO and Fashion550K. Experimental results show that our A-GCN significantly improves over baseline methods and achieves performance superior or comparable to the state of the art.
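The pipeline described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation. It makes several assumptions: a 1 × 1 convolution over label nodes is treated as a per-label linear map; the dot-product graph is squashed with a sigmoid; the sparse correlation constraint is approximated by a mean absolute value (L1-style) penalty; and the GCN has two layers with a LeakyReLU in between. All names, shapes, and weights (`label_graph`, `gcn_layer`, `a_gcn_scores`, `params`) are hypothetical and randomly initialised for illustration only.

```python
import numpy as np


def label_graph(emb, w_q, w_k):
    """Adaptive label graph from word embeddings.

    Two per-label linear maps (standing in for the two 1x1 conv layers)
    followed by a dot product. The sigmoid is an assumed normalisation.
    """
    q = emb @ w_q                          # (C, h)
    k = emb @ w_k                          # (C, h)
    logits = q @ k.T                       # (C, C) pairwise label affinities
    return 1.0 / (1.0 + np.exp(-logits))   # entries in (0, 1)


def gcn_layer(adj, x, w, activate=True):
    """One graph convolution: propagate over adj, then project with w."""
    h = adj @ x @ w
    # LeakyReLU with slope 0.2 (an assumed choice of non-linearity)
    return np.where(h > 0, h, 0.2 * h) if activate else h


def a_gcn_scores(img_feat, emb, params):
    """Map label embeddings to classifiers and apply them to image features."""
    adj = label_graph(emb, params["w_q"], params["w_k"])
    hidden = gcn_layer(adj, emb, params["w1"])
    classifiers = gcn_layer(adj, hidden, params["w2"], activate=False)  # (C, D)
    return img_feat @ classifiers.T, adj   # scores: (B, C)


# Toy sizes: C labels, d-dim word embeddings, D-dim image features, batch B.
rng = np.random.default_rng(0)
C, d, h, D, B = 5, 8, 4, 16, 3
params = {
    "w_q": rng.normal(scale=0.1, size=(d, h)),
    "w_k": rng.normal(scale=0.1, size=(d, h)),
    "w1": rng.normal(scale=0.1, size=(d, 12)),
    "w2": rng.normal(scale=0.1, size=(12, D)),
}
emb = rng.normal(size=(C, d))        # stand-in for label word embeddings
img_feat = rng.normal(size=(B, D))   # stand-in for CNN image features

scores, adj = a_gcn_scores(img_feat, emb, params)
# Assumed form of the sparse correlation constraint: an L1-style penalty
# that would be added to the classification loss during training.
sparsity_penalty = np.abs(adj).mean()
```

In a trainable version, `w_q`, `w_k`, `w1`, and `w2` would be learned jointly with the image backbone, so the label graph adapts to the data instead of being fixed from co-occurrence statistics.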




Updated: 2020-08-20