Fine-Grained Instance-Level Sketch-Based Video Retrieval
IEEE Transactions on Circuits and Systems for Video Technology (IF 8.3), Pub Date: 2020-08-06, DOI: 10.1109/tcsvt.2020.3014491
Peng Xu, Kun Liu, Tao Xiang, Timothy M. Hospedales, Zhanyu Ma, Jun Guo, Yi-Zhe Song

Existing sketch-analysis work studies sketches depicting static objects or scenes. In this work, we propose a novel cross-modal retrieval problem, fine-grained instance-level sketch-based video retrieval (FG-SBVR), in which a sketch sequence is used as a query to retrieve a specific target video instance. This is more challenging than sketch-based still image retrieval and coarse-grained category-level video retrieval, because visual appearance and motion must be matched simultaneously at a fine-grained level. We contribute the first FG-SBVR dataset with rich annotations. We then introduce a novel multi-stream multi-modality deep network to perform FG-SBVR under both strongly and weakly supervised settings. The key component of the network is a relation module, designed to prevent model overfitting given scarce training data. We show that this model significantly outperforms a number of existing state-of-the-art models designed for video analysis.

Updated: 2020-08-06