Fine-Grained Instance-Level Sketch-Based Video Retrieval
IEEE Transactions on Circuits and Systems for Video Technology (IF 8.3), Pub Date: 2020-08-06, DOI: 10.1109/tcsvt.2020.3014491
Peng Xu, Kun Liu, Tao Xiang, Timothy M. Hospedales, Zhanyu Ma, Jun Guo, Yi-Zhe Song

Existing sketch-analysis work studies sketches depicting static objects or scenes. In this work, we propose a novel cross-modal retrieval problem, fine-grained instance-level sketch-based video retrieval (FG-SBVR), in which a sketch sequence is used as a query to retrieve a specific target video instance. Compared with sketch-based still image retrieval and coarse-grained category-level video retrieval, this is more challenging because both visual appearance and motion must be matched simultaneously at a fine-grained level. We contribute the first FG-SBVR dataset with rich annotations. We then introduce a novel multi-stream multi-modality deep network to perform FG-SBVR under both strongly and weakly supervised settings. The key component of the network is a relation module, designed to prevent model overfitting given scarce training data. We show that this model significantly outperforms a number of existing state-of-the-art models designed for video analysis.

Updated: 2020-08-06
