Video Relation Detection with Trajectory-aware Multi-modal Features
arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2021-01-20, DOI: arxiv-2101.08165. Authors: Wentao Xie, Guanghui Ren, Si Liu
The video relation detection problem refers to detecting relationships
between different objects in videos, such as spatial relationships and action
relationships. In this paper, we present video relation detection with
trajectory-aware multi-modal features to solve this task. Considering the
complexity of visual relation detection in videos, we decompose the task into
three sub-tasks: object detection, trajectory proposal, and relation
prediction. We use a state-of-the-art object detection method to ensure the
accuracy of object trajectory detection, and a multi-modal feature
representation to help predict the relations between objects. Our method won
first place in the video relation detection task of the Video Relation
Understanding Grand Challenge at ACM Multimedia 2020 with 11.74% mAP, which
surpasses other methods by a large margin.
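
The three-stage decomposition described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the detector, the linking rule, or the features, so the greedy IoU-based linking, the `Trajectory` class, and the helper names `link_detections` and `candidate_pairs` are all assumptions made for illustration. Per-frame detections are taken as given (stage 1), linked into trajectories (stage 2), and paired into subject-object candidates that a relation classifier would then score (stage 3, left out here).

```python
from dataclasses import dataclass, field
from itertools import permutations

Box = tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Trajectory:
    """Hypothetical container: one object tracked across frames."""
    label: str
    boxes: dict[int, Box] = field(default_factory=dict)  # frame index -> box


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_detections(frames, iou_thresh=0.5):
    """Greedily link per-frame detections into trajectories.

    frames: list indexed by frame, each a list of (label, box) detections.
    A detection extends the same-label trajectory whose box in the previous
    frame overlaps it most (IoU >= iou_thresh); otherwise it starts a new one.
    This linking rule is an assumption, not the method from the paper.
    """
    trajectories = []
    for t, detections in enumerate(frames):
        extended = set()  # trajectories already extended in this frame
        for label, box in detections:
            best, best_iou = None, iou_thresh
            for traj in trajectories:
                if id(traj) in extended or traj.label != label:
                    continue
                if (t - 1) not in traj.boxes:
                    continue
                score = iou(traj.boxes[t - 1], box)
                if score >= best_iou:
                    best, best_iou = traj, score
            if best is None:
                best = Trajectory(label)
                trajectories.append(best)
            best.boxes[t] = box
            extended.add(id(best))
    return trajectories


def candidate_pairs(trajectories, min_overlap=1):
    """Ordered (subject, object, shared_frames) pairs that co-occur in time."""
    pairs = []
    for subj, obj in permutations(trajectories, 2):
        shared = sorted(subj.boxes.keys() & obj.boxes.keys())
        if len(shared) >= min_overlap:
            pairs.append((subj, obj, shared))
    return pairs
```

Each candidate pair would then be passed to a relation predictor that consumes features computed over the shared frames. The abstract calls these features multi-modal but does not enumerate them, so that stage is omitted rather than guessed.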
Updated: 2021-01-21