Object-Level Scene Context Prediction
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 23.6), Pub Date: 2021-04-27, DOI: 10.1109/tpami.2021.3075676
Xiaotian Qiao, Quanlong Zheng, Ying Cao, Rynson W.H. Lau

Contextual information plays an important role in solving various image and scene understanding tasks. Prior works have focused on extracting contextual information from an image and using it to infer the properties of objects in the image or to understand the scene behind the image, e.g., context-based object detection, recognition, and semantic segmentation. In this paper, we consider the inverse problem, i.e., how to hallucinate the missing contextual information from the properties of standalone objects. We refer to this as object-level scene context prediction. The problem is difficult because it requires extensive knowledge of the complex and diverse relationships among objects in a scene. We propose a deep neural network that takes as input the properties (i.e., category, shape, and position) of a few standalone objects and predicts an object-level scene layout that compactly encodes the semantics and structure of the scene context in which the given objects reside. Quantitative experiments and user studies demonstrate that our model generates more plausible scene contexts than the baselines. Our model also enables the synthesis of realistic scene images from partial scene layouts. Finally, we validate that our model internally learns useful features for scene recognition and fake scene detection.

Updated: 2021-04-27