Subgraph and object context-masked network for scene graph generation,IET Computer Vision


Subgraph and object context-masked network for scene graph generation
IET Computer Vision (IF 1.5), Pub Date: 2020-11-16, DOI: 10.1049/iet-cvi.2019.0896
Zhenxing Zheng¹,², Zhendong Li¹,², Gaoyun An¹,², Songhe Feng³

Scene graph generation aims to recognise the objects in an image and the semantic relationships between them, helping computers understand visual scenes. Geometric information is essential for relationship prediction and is usually incorporated into relationship features. Existing methods encode the spatial layout of objects from their coordinates alone and thereby neglect the objects' context. To make full and efficient use of spatial knowledge, the authors propose a novel subgraph and object context-masked network (SOCNet) for scene graph generation, consisting of a spatial mask relation inference (SMRI) module and a hierarchical message passing (HMP) module. SMRI masks part of the context of object features according to the spatial layout of the objects and their corresponding subgraph, which facilitates relationship recognition. To refine the features of objects and subgraphs, HMP passes highly correlated messages at both microscopic and macroscopic levels through a triple-path structure comprising subgraph–subgraph, object–object, and subgraph–object paths. Finally, statistical co-occurrence probability is used to regularise relationship prediction. SOCNet integrates HMP and SMRI into a unified network, and comprehensive experiments on the Visual Relationship Detection and Visual Genome datasets indicate that SOCNet outperforms several state-of-the-art methods on two common tasks.
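The co-occurrence regularisation mentioned in the abstract could be sketched roughly as follows. The abstract does not say how the statistics are combined with the network's predictions, so this sketch assumes a Laplace-smoothed empirical estimate of P(predicate | subject, object) that is added as a log prior to the network's predicate scores. The function names, the smoothing constant, the weight parameter, and the toy triples are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_prior(triples, predicates, smoothing=1.0):
    """Build a function returning a smoothed P(predicate | subject, object).

    `triples` is an iterable of (subject, predicate, object) labels.
    Assumption: add-`smoothing` (Laplace) smoothing over the predicate set.
    """
    counts = defaultdict(Counter)
    for subj, pred, obj in triples:
        counts[(subj, obj)][pred] += 1

    def prior(subj, obj):
        # Unseen pairs fall back to an empty Counter, giving a uniform prior.
        pair_counts = counts.get((subj, obj), Counter())
        total = sum(pair_counts.values()) + smoothing * len(predicates)
        return {p: (pair_counts[p] + smoothing) / total for p in predicates}

    return prior


def regularise(scores, prior_probs, weight=1.0):
    """Add a weighted log prior to the raw scores, then apply a softmax.

    Assumption: `scores` are unnormalised predicate logits from some model.
    """
    logits = {p: s + weight * math.log(prior_probs[p]) for p, s in scores.items()}
    peak = max(logits.values())  # subtract the maximum for numerical stability
    exps = {p: math.exp(v - peak) for p, v in logits.items()}
    norm = sum(exps.values())
    return {p: v / norm for p, v in exps.items()}


# Toy data, purely illustrative.
PREDICATES = ["on", "riding", "wearing"]
train_triples = [
    ("person", "riding", "horse"),
    ("person", "riding", "horse"),
    ("person", "on", "horse"),
    ("person", "wearing", "hat"),
]

prior = cooccurrence_prior(train_triples, PREDICATES)
flat_scores = {p: 0.0 for p in PREDICATES}  # a model with no preference
probs = regularise(flat_scores, prior("person", "horse"))
best = max(probs, key=probs.get)
```

When the model scores are flat, the regularised distribution reduces to the prior itself, so the most frequent training predicate for the pair ("riding" in the toy data) is selected. With informative scores, the `weight` parameter controls how strongly the dataset statistics pull the prediction.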


Updated: 2020-11-17
