Spatial-temporal dynamic hand gesture recognition via hybrid deep learning model
Journal on Multimodal User Interfaces (IF 2.9) Pub Date: 2019-05-14, DOI: 10.1007/s12193-019-00304-z
Jinghua Li, Huarui Huai, Junbin Gao, Dehui Kong, Lichun Wang

Hand gestures are a natural mode of interaction, and hand gesture recognition has recently become increasingly popular in human–computer interaction. However, the complexity and variability of hand gestures, such as varying illumination, viewpoints, and self-structural characteristics, still make hand gesture recognition challenging. The core problems are how to design an appropriate feature representation and classifier. To this end, this paper develops an expressive hybrid deep hand gesture recognition architecture called CNN-MVRBM-NN. The framework consists of three submodels. The CNN submodel automatically extracts frame-level spatial features; the MVRBM submodel fuses spatial information over time to learn the higher-level semantics inherent in hand gestures; and the NN submodel classifies hand gestures. The NN is initialized by the MVRBM for second-order data representation, and this MVRBM-pretrained NN can then be fine-tuned by back-propagation to become more discriminative. Experimental results on the Cambridge Hand Gesture Data set show that the proposed hybrid CNN-MVRBM-NN achieves state-of-the-art recognition performance.
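For intuition, below is a minimal sketch (in PyTorch) of the three-submodel pipeline the abstract describes: a CNN applied per frame, a fusion layer over the resulting frame-by-feature matrix, and an NN classifier fine-tuned end to end. All layer sizes are illustrative assumptions, and the randomly initialised linear fusion layer merely stands in for the authors' MVRBM (matrix-variate RBM) pre-training, which is omitted here.

```python
# Minimal sketch of the CNN-MVRBM-NN pipeline; sizes and the linear
# stand-in for the MVRBM-initialised fusion layer are assumptions,
# not the authors' implementation.
import torch
import torch.nn as nn

class FrameCNN(nn.Module):
    """Extracts a spatial feature vector from a single video frame."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
        )
        self.fc = nn.Linear(32 * 4 * 4, feat_dim)

    def forward(self, x):  # x: (B, 3, H, W)
        return self.fc(self.conv(x).flatten(1))

class GestureNet(nn.Module):
    """CNN per frame -> temporal fusion of the feature matrix -> NN classifier.

    In the paper, the fusion mapping would be pre-trained as an MVRBM on the
    (frames x feat_dim) matrices and then fine-tuned by back-propagation;
    here it is randomly initialised for brevity.
    """
    def __init__(self, n_frames=10, feat_dim=128, hidden=256, n_classes=9):
        super().__init__()
        self.cnn = FrameCNN(feat_dim)
        self.fusion = nn.Linear(n_frames * feat_dim, hidden)  # MVRBM stand-in
        self.classifier = nn.Sequential(nn.ReLU(), nn.Linear(hidden, n_classes))

    def forward(self, clips):  # clips: (B, T, 3, H, W)
        B, T = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).view(B, T, -1)   # per-frame features
        return self.classifier(self.fusion(feats.flatten(1)))  # fuse over time

# Usage: classify a batch of two 10-frame clips (9 classes, matching the
# Cambridge Hand Gesture Data set).
logits = GestureNet()(torch.randn(2, 10, 3, 64, 64))
print(logits.shape)  # torch.Size([2, 9])
```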

Updated: 2019-05-14