Video Relation Detection with Trajectory-aware Multi-modal Features
arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2021-01-20, DOI: arxiv-2101.08165
Wentao Xie, Guanghui Ren, Si Liu

Video relation detection is the task of detecting relationships between objects in a video, such as spatial relationships and action relationships. In this paper, we present a video relation detection method based on trajectory-aware multi-modal features. Given the complexity of visual relation detection in videos, we decompose the task into three sub-tasks: object detection, trajectory proposal, and relation prediction. We use a state-of-the-art object detection method to ensure accurate object trajectory detection, and a multi-modal feature representation to help predict the relations between objects. Our method won first place in the video relation detection task of the Video Relation Understanding Grand Challenge at ACM Multimedia 2020 with 11.74% mAP, surpassing other methods by a large margin.
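
The abstract names the three sub-tasks but gives no implementation details, so the following is only an illustrative sketch of how the last two stages might be wired together once per-frame detections exist. Everything in it is an assumption rather than the authors' method: the `Trajectory` class, the `link_detections` and `candidate_pairs` helpers, the 0.5 IoU threshold, and the use of greedy IoU linking as a generic stand-in for trajectory proposal. Relation prediction itself is left out; the sketch stops at enumerating the subject-object trajectory pairs that a relation classifier would score.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[str, Box]              # (class label, box), assumed detector output


@dataclass
class Trajectory:
    """Hypothetical container: one object tracked across frames."""
    label: str
    boxes: Dict[int, Box] = field(default_factory=dict)  # frame index -> box


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def link_detections(frames: List[List[Detection]], thr: float = 0.5) -> List[Trajectory]:
    """Greedy IoU linking (an assumed placeholder for trajectory proposal).

    Each detection extends the same-label trajectory whose box in the
    previous frame overlaps it most (IoU >= thr); otherwise it starts
    a new trajectory.
    """
    trajs: List[Trajectory] = []
    for t, dets in enumerate(frames):
        used = set()  # trajectories already extended or created in frame t
        for label, box in dets:
            best, best_iou = None, thr
            for i, tr in enumerate(trajs):
                if i in used or tr.label != label or (t - 1) not in tr.boxes:
                    continue
                v = iou(tr.boxes[t - 1], box)
                if v >= best_iou:
                    best, best_iou = i, v
            if best is None:
                trajs.append(Trajectory(label, {t: box}))
                best = len(trajs) - 1
            else:
                trajs[best].boxes[t] = box
            used.add(best)
    return trajs


def candidate_pairs(trajs: List[Trajectory]) -> List[Tuple[int, int, List[int]]]:
    """Ordered (subject, object) pairs that co-occur in at least one frame."""
    pairs = []
    for i, s in enumerate(trajs):
        for j, o in enumerate(trajs):
            shared = sorted(set(s.boxes) & set(o.boxes))
            if i != j and shared:
                pairs.append((i, j, shared))
    return pairs


if __name__ == "__main__":
    # Toy detections for three frames: a slowly moving dog and a person.
    frames = [
        [("dog", (0, 0, 10, 10)), ("person", (50, 0, 60, 20))],
        [("dog", (1, 0, 11, 10)), ("person", (50, 0, 60, 20))],
        [("dog", (2, 0, 12, 10)), ("person", (51, 0, 61, 20))],
    ]
    trajs = link_detections(frames)
    for subj, obj, shared in candidate_pairs(trajs):
        print(trajs[subj].label, "->", trajs[obj].label, "frames", shared)
```

Pairs are ordered because relations such as "dog chases person" are directional, so each pair of co-occurring trajectories yields two candidates for the relation predictor.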

Updated: 2021-01-21