Multi-stream Network for Human-object Interaction Detection
International Journal of Pattern Recognition and Artificial Intelligence (IF 0.9), Pub Date: 2021-03-12, DOI: 10.1142/s0218001421500257
Chang Wang, Jinyu Sun, Shiwei Ma, Yuqiu Lu, Wang Liu

Detecting the interaction between humans and objects in images is critical for a deeper understanding of the visual relationships in a scene, and it is also a key technology in many practical applications such as augmented reality, video surveillance and information retrieval. Nevertheless, owing to the fine-grained actions and objects in real scenes and the coexistence of multiple interactions within a single scene, the problem is far from solved. Unlike prior approaches, which focus only on instance features, this paper proposes a method that employs a four-stream CNN network for human-object interaction (HOI) detection. More detailed visual features, spatial features and pose features are extracted from human-object pairs to address this challenging detection task. Specifically, the core idea is that the region where a person interacts with an object contains important identifying cues for specific action classes, and these detailed cues can be fused to facilitate HOI recognition. Experiments on two large-scale public HOI benchmarks, V-COCO and HICO-DET, demonstrate the effectiveness of the proposed method.
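
The abstract does not spell out the exact composition of the four streams or the fusion operator, so the following PyTorch sketch is only an illustration of the general multi-stream idea: human-appearance, object-appearance, spatial-layout and pose streams whose per-action logits are fused by summation before a sigmoid. All layer sizes, input dimensions and the action-class count are placeholder assumptions, not the published architecture.

```python
# Illustrative four-stream HOI scoring head (not the authors' exact model).
import torch
import torch.nn as nn


class FourStreamHOIHead(nn.Module):
    def __init__(self, visual_dim=2048, spatial_dim=2 * 64 * 64,
                 pose_dim=17 * 2, num_actions=29):
        super().__init__()
        # Human and object appearance streams (e.g. RoI-pooled backbone features).
        self.human_stream = nn.Sequential(
            nn.Linear(visual_dim, 1024), nn.ReLU(), nn.Linear(1024, num_actions))
        self.object_stream = nn.Sequential(
            nn.Linear(visual_dim, 1024), nn.ReLU(), nn.Linear(1024, num_actions))
        # Spatial stream: a two-channel binary mask encoding the human and object boxes.
        self.spatial_stream = nn.Sequential(
            nn.Linear(spatial_dim, 512), nn.ReLU(), nn.Linear(512, num_actions))
        # Pose stream: normalized 2D keypoint coordinates of the human.
        self.pose_stream = nn.Sequential(
            nn.Linear(pose_dim, 256), nn.ReLU(), nn.Linear(256, num_actions))

    def forward(self, human_feat, object_feat, spatial_map, pose_kpts):
        # Late fusion by summing per-stream action logits, so every stream
        # contributes an additive cue for each action class.
        logits = (self.human_stream(human_feat)
                  + self.object_stream(object_feat)
                  + self.spatial_stream(spatial_map.flatten(1))
                  + self.pose_stream(pose_kpts.flatten(1)))
        return torch.sigmoid(logits)  # multi-label action probabilities


if __name__ == "__main__":
    head = FourStreamHOIHead()
    scores = head(torch.randn(4, 2048), torch.randn(4, 2048),
                  torch.randn(4, 2, 64, 64), torch.randn(4, 17, 2))
    print(scores.shape)  # torch.Size([4, 29])
```

The number of action classes, the choice of a two-channel box mask for the spatial stream and the 17-keypoint pose encoding are all assumptions made for the sketch; the paper's actual stream inputs and fusion scheme may differ.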
