Video sketch: A middle-level representation for action recognition
Applied Intelligence (IF 5.3), Pub Date: 2020-11-06, DOI: 10.1007/s10489-020-01905-y
Xing-Yuan Zhang, Ya-Ping Huang, Yang Mi, Yan-Ting Pei, Qi Zou, Song Wang

Different modalities extracted from videos, such as RGB and optical flow, can provide complementary cues for video action recognition. In this paper, we introduce a new modality, named video sketch, that conveys human shape information and serves as a complementary modality for video action representation. We show that video action recognition can be enhanced by using the proposed video sketch. More specifically, we first generate video sketches with class-distinctive action areas and then employ a two-stream network to combine the shape information extracted from image-based and point-based sketches; the classification scores of the two streams are fused to produce a shape representation for each video. Finally, we use this shape representation as a complement to the traditional appearance (RGB) and motion (optical flow) representations for the final video classification. We conduct extensive experiments on human action recognition datasets: KTH, HMDB51, UCF101, Something-Something and UTI. The experimental results show that the proposed method outperforms the existing state-of-the-art action recognition methods.
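The score fusion described in the abstract can be illustrated with a minimal late-fusion sketch. This is not the authors' implementation: the abstract does not state how scores are fused, so the weighted averaging of softmax probabilities, the function names `softmax` and `fuse_probs`, the `weights` parameter, and the toy logits are all assumptions made for illustration.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def fuse_probs(stream_probs, weights=None):
    """Weighted average of per-stream class probabilities (late fusion).

    stream_probs: list of arrays, each of shape (num_videos, num_classes).
    weights: hypothetical per-stream weights; equal weighting if None.
    """
    probs = np.stack([np.asarray(p, dtype=float) for p in stream_probs])
    if weights is None:
        weights = np.ones(len(stream_probs))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalise so rows still sum to 1
    # Contract the stream axis: (S,) x (S, N, C) -> (N, C)
    return np.tensordot(weights, probs, axes=1)


# Toy logits for one video and three classes (made-up numbers).
image_sketch_logits = np.array([[2.0, 0.5, 0.1]])
point_sketch_logits = np.array([[1.5, 1.0, 0.2]])
rgb_logits = np.array([[0.3, 2.2, 0.1]])
flow_logits = np.array([[1.8, 0.4, 0.6]])

# Stage 1: fuse the two sketch streams into a shape score.
shape_probs = fuse_probs([softmax(image_sketch_logits),
                          softmax(point_sketch_logits)])

# Stage 2: combine shape with appearance (RGB) and motion (optical flow).
final_probs = fuse_probs([softmax(rgb_logits),
                          softmax(flow_logits),
                          shape_probs])
predicted_class = final_probs.argmax(axis=-1)
```

Averaging probabilities rather than raw logits keeps each stream on a comparable scale; in practice the per-stream weights would typically be tuned on a validation set.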




Updated: 2020-11-06