Human action recognition based on convolutional neural network and spatial pyramid representation
Journal of Visual Communication and Image Representation (IF 2.6), Pub Date: 2019-11-25, DOI: 10.1016/j.jvcir.2019.102722
Jihai Xiao, Xiaohong Cui, Feng Li

Detecting and recognizing human actions in natural indoor and outdoor scenes is an important technique in computer vision and intelligent systems, with wide application in video surveillance, pedestrian tracking, and human-computer interaction. Conventional approaches based on various features have achieved impressive performance. However, these methods fail to cope with partial occlusion and changes in posture. To address these limitations, we propose a novel human action recognition method. Specifically, to capture the spatial composition of an image, we use a three-level spatial pyramid feature extraction scheme, in which each pyramid is encoded by local features. Regions generated by a proposal algorithm are then fed into a dual-aggregation net to extract deep representations. The local and deep features are then fused to describe each image. To describe human action categories, we design a metric, CXQDA, based on the cosine measure and Cross-view Quadratic Discriminant Analysis (XQDA), to calculate the similarity among different action categories. Experimental results demonstrate that the proposed method copes effectively with object scale variations and partial occlusion and achieves competitive performance.
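
As a rough illustration of the ideas named in the abstract, the sketch below builds a three-level spatial pyramid histogram from a grid of local-feature codes, concatenates it with a deep feature vector, and compares two descriptors with a cosine measure. It is not the authors' implementation: the 1x1, 2x2, 4x4 grid layout, the visual-word encoding, the plain concatenation used for fusion, and all function names (`spatial_pyramid_histogram`, `fuse`, `cosine_similarity`) are assumptions made for illustration, and the XQDA part of CXQDA is not reproduced here.

```python
import numpy as np


def spatial_pyramid_histogram(codes, n_words, levels=3):
    """Concatenate per-cell visual-word histograms over a spatial pyramid.

    codes: 2-D integer array; each entry is a visual-word index in
           [0, n_words) assigned to a local feature at that location
           (hypothetical encoding, assumed for this sketch).
    """
    h, w = codes.shape
    parts = []
    for level in range(levels):
        cells = 2 ** level  # assumed grids: 1x1, 2x2, 4x4
        row_edges = np.linspace(0, h, cells + 1).astype(int)
        col_edges = np.linspace(0, w, cells + 1).astype(int)
        for i in range(cells):
            for j in range(cells):
                block = codes[row_edges[i]:row_edges[i + 1],
                              col_edges[j]:col_edges[j + 1]]
                parts.append(np.bincount(block.ravel(), minlength=n_words))
    feat = np.concatenate(parts).astype(float)
    norm = np.linalg.norm(feat)
    return feat / norm if norm > 0 else feat


def fuse(local_feat, deep_feat):
    """Fuse local and deep descriptors by concatenation (an assumption)."""
    return np.concatenate([local_feat, deep_feat])


def cosine_similarity(a, b):
    """Cosine measure between two descriptors; 0.0 if either is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0
```

With `n_words` visual words and three levels, the pyramid descriptor has `n_words * (1 + 4 + 16)` entries; the deep feature vector would come from whatever network is used and is simply appended by `fuse`.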

Updated: 2019-11-25