RGB-D Human Action Recognition of Deep Feature Enhancement and Fusion Using Two-Stream ConvNet
Journal of Sensors (IF 1.4), Pub Date: 2021-01-07, DOI: 10.1155/2021/8864870
Yun Liu 1 , Ruidi Ma 1 , Hui Li 1 , Chuanxu Wang 1 , Ye Tao 1
Action recognition is an important research direction in computer vision. Recognition performance based on RGB video is easily affected by factors such as background and lighting, whereas depth video can better suppress such interference and improve recognition accuracy. This paper therefore makes full use of video and depth skeleton data and proposes a two-stream network for RGB-D action recognition (SV-GCN), an architecture in which each stream processes a different data modality. For the skeleton data, we propose Nonlocal-stgcn (S-Stream), which adds nonlocal operations to capture dependencies between a wider range of joints and thereby provides richer skeleton-point features to the model. For the video data, we propose Dilated-slowfastnet (V-Stream), which replaces the traditional random-sampling layer with dilated convolutional layers to make better use of depth features. Finally, the information from the two streams is fused to perform action recognition. Experimental results on the NTU-RGB+D dataset show that the proposed method significantly improves recognition accuracy and outperforms st-gcn and Slowfastnet on both the cross-subject (CS) and cross-view (CV) benchmarks.
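As a rough illustration of the two ideas the abstract names, the sketch below (NumPy; all shapes, weights, and function names are hypothetical, since the paper's actual layer configuration and fusion rule are not given here) shows (1) a nonlocal-style operation in which every skeleton joint attends to every other joint, capturing long-range joint dependencies, and (2) late fusion of the two streams' class scores by a weighted sum of softmax outputs.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def nonlocal_block(x):
    """Embedded-Gaussian-style nonlocal operation over joints.
    x: (J, C) feature matrix for J skeleton joints.
    Each joint's output aggregates features from ALL joints,
    weighted by pairwise similarity, so dependencies are not
    limited to physically adjacent joints as in a plain graph conv."""
    attn = softmax(x @ x.T / np.sqrt(x.shape[1]), axis=-1)  # (J, J) attention
    return x + attn @ x  # residual connection, as in nonlocal networks

def fuse_scores(s_logits, v_logits, w=0.5):
    """Late fusion: weighted sum of per-stream softmax scores.
    The weight w is a hypothetical hyperparameter."""
    return w * softmax(s_logits) + (1 - w) * softmax(v_logits)

rng = np.random.default_rng(0)
joints = rng.standard_normal((25, 64))   # NTU-RGB+D skeletons have 25 joints
enhanced = nonlocal_block(joints)        # enriched joint features, (25, 64)

s_logits = rng.standard_normal(60)       # NTU-RGB+D has 60 action classes
v_logits = rng.standard_normal(60)
probs = fuse_scores(s_logits, v_logits)  # fused class distribution, sums to 1
```

This is only a score-level fusion sketch; the paper may instead fuse intermediate features, and its nonlocal block operates inside an ST-GCN over space and time rather than on a single static pose.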
