Dynamic Hand Gesture Recognition Using Improved Spatio-Temporal Graph Convolutional Network
IEEE Transactions on Circuits and Systems for Video Technology (IF 8.3), Pub Date: 4-5-2022, DOI: 10.1109/tcsvt.2022.3165069
Jae-Hun Song 1 , Kyeongbo Kong 2 , Suk-Ju Kang 1

Hand gesture recognition is essential to human-computer interaction, as gestures are the most natural way of communicating. Furthermore, with the development of 3D hand pose estimation technology and the performance improvement of low-cost depth cameras, skeleton-based dynamic hand gesture recognition has received much attention. This paper proposes a novel multi-stream improved spatio-temporal graph convolutional network (MS-ISTGCN) for skeleton-based dynamic hand gesture recognition. We adopt an adaptive spatial graph convolution that can learn the relationship between distant hand joints, and we propose an extended temporal graph convolution with multiple dilation rates that can extract informative temporal features over short to long periods. Furthermore, we add a new attention layer, consisting of effective spatio-temporal attention and channel attention, between the spatial and temporal graph convolution layers to find and focus on key features. Finally, we propose a multi-stream structure that takes multiple data modalities (i.e., joints, bones, and motions) as inputs and improves performance through ensembling. Each of the three streams is trained independently, and their outputs are fused to predict the final hand gesture. The performance of the proposed method is verified through extensive experiments on two widely used public dynamic hand gesture datasets: SHREC’17 Track and DHG-14/28. Compared with state-of-the-art methods, our proposed method achieves the highest recognition accuracy in various gesture categories on both datasets.
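
The multi-stream idea in the abstract (joints, bones, and motions as separate inputs, with independently trained streams fused at prediction time) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy 5-joint `PARENTS` chain, the definition of a bone as a joint minus its parent, the definition of motion as a frame-to-frame difference, and fusion by averaging softmax scores. The paper may define these differently.

```python
import math

# Hypothetical parent index for each joint of a toy 5-joint chain
# (a wrist followed by 4 finger joints); the root joint is its own parent.
PARENTS = [0, 0, 1, 2, 3]


def bones_from_joints(frames, parents=PARENTS):
    """Bone vector = joint position minus its parent's position, per frame."""
    return [
        [tuple(j - p for j, p in zip(frame[i], frame[parents[i]]))
         for i in range(len(frame))]
        for frame in frames
    ]


def motion_from_joints(frames):
    """Motion vector = joint position at frame t+1 minus position at frame t.

    The last frame is padded with zeros so the stream keeps the same length.
    """
    motion = [
        [tuple(b - a for a, b in zip(cur, nxt)) for cur, nxt in zip(f0, f1)]
        for f0, f1 in zip(frames, frames[1:])
    ]
    zeros = [tuple(0.0 for _ in joint) for joint in frames[-1]]
    return motion + [zeros]


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(stream_logits):
    """Average per-stream softmax scores (assumed fusion rule).

    Returns the predicted class index and the fused score vector.
    """
    probs = [softmax(logits) for logits in stream_logits]
    n = len(probs)
    fused = [sum(p[c] for p in probs) / n for c in range(len(probs[0]))]
    return max(range(len(fused)), key=fused.__getitem__), fused
```

In this sketch, each stream's classifier would receive one of the three sequences (`frames`, `bones_from_joints(frames)`, `motion_from_joints(frames)`), and `fuse_streams` would combine their per-class logits. The classifiers themselves (the graph convolution and attention layers) are out of scope here.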

Updated: 2024-08-28