Real-time one-shot learning gesture recognition based on lightweight 3D Inception-ResNet with separable convolutions
Pattern Analysis and Applications (IF 3.9), Pub Date: 2021-04-23, DOI: 10.1007/s10044-021-00965-1
Lianwei Li, Shiyin Qin, Zhi Lu, Dinghao Zhang, Kuanhong Xu, Zhongying Hu

Gesture recognition is an active research area in computer vision, and deep neural networks have greatly improved its performance. However, typical deep learning models contain a large number of parameters, which prevents their practical deployment on resource-limited devices. Meanwhile, collecting a large number of training samples is usually time-consuming and difficult. To address these issues, we propose a lightweight 3D Inception-ResNet that extracts discriminative features for real-time one-shot learning gesture recognition, which aims to recognize gestures given only one training sample per new class. For efficient feature extraction, we first extend the original 2D Inception-ResNet to a 3D version and then apply two kinds of separable convolutions, together with several other design strategies, to reduce the number of parameters and the computational complexity, so that feature extraction runs in real time even on a CPU. The storage footprint of the model is also greatly reduced. To obtain robust one-shot recognition, we employ an evolution mechanism that updates the root sample with information from newly observed samples, enhancing the performance of the nearest-neighbor classifier. In addition, we propose an update strategy for a dynamic threshold to address the problem of threshold selection in real-world applications. To further improve robustness, we apply artificial data synthesis to augment our collected dataset. A series of experiments on public datasets and on our collected dataset demonstrates the effectiveness of our approach to one-shot learning gesture recognition.
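The abstract mentions two kinds of separable convolutions used to slim down the 3D Inception-ResNet but does not spell out their form. The sketch below, written in PyTorch as an assumption (the paper's actual layer definitions may differ), shows the two factorizations most commonly meant by this phrase: a spatio-temporal (2+1)D factorization and a depthwise-separable 3D convolution, both of which replace a full k×k×k convolution with far fewer parameters.

```python
import torch
import torch.nn as nn

class SpatioTemporalSeparableConv3d(nn.Module):
    """(2+1)D-style factorization: a kxk spatial conv followed by a k-length
    temporal conv, replacing a full kxkxk 3D convolution. Illustrative only."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.spatial = nn.Conv3d(in_ch, out_ch, kernel_size=(1, k, k),
                                 padding=(0, k // 2, k // 2), bias=False)
        self.temporal = nn.Conv3d(out_ch, out_ch, kernel_size=(k, 1, 1),
                                  padding=(k // 2, 0, 0), bias=False)

    def forward(self, x):            # x: (N, C, T, H, W)
        return self.temporal(self.spatial(x))

class DepthwiseSeparableConv3d(nn.Module):
    """Depthwise 3D conv (one filter per input channel) followed by a 1x1x1
    pointwise conv that mixes channels. Illustrative only."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.depthwise = nn.Conv3d(in_ch, in_ch, kernel_size=k, padding=k // 2,
                                   groups=in_ch, bias=False)
        self.pointwise = nn.Conv3d(in_ch, out_ch, kernel_size=1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

if __name__ == "__main__":
    clip = torch.randn(1, 3, 16, 112, 112)   # (batch, channels, frames, H, W)
    print(SpatioTemporalSeparableConv3d(3, 64)(clip).shape)
    print(DepthwiseSeparableConv3d(3, 64)(clip).shape)
```

Similarly, the one-shot recognition pipeline described above (a nearest-neighbor classifier over extracted features, a per-class root sample that evolves as new samples arrive, and a dynamically updated rejection threshold) can be sketched as follows. The moving-average update and threshold rule here are placeholders, since the abstract does not give the authors' exact formulas.

```python
import numpy as np

class OneShotNearestNeighbor:
    """Illustrative one-shot classifier: each class keeps a single 'root'
    feature vector; prediction is cosine-similarity nearest neighbor with a
    rejection threshold."""
    def __init__(self, threshold=0.5, momentum=0.9):
        self.roots = {}               # class label -> root feature vector
        self.threshold = threshold
        self.momentum = momentum

    @staticmethod
    def _cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

    def register(self, label, feature):
        """One-shot enrollment: the single training sample becomes the root."""
        self.roots[label] = np.asarray(feature, dtype=np.float32)

    def predict(self, feature):
        """Return (label, similarity), or (None, best) if below the threshold."""
        feature = np.asarray(feature, dtype=np.float32)
        best_label, best_sim = None, -1.0
        for label, root in self.roots.items():
            sim = self._cos(feature, root)
            if sim > best_sim:
                best_label, best_sim = label, sim
        if best_sim < self.threshold:
            return None, best_sim
        return best_label, best_sim

    def evolve(self, label, feature, sim):
        """Blend a confidently matched sample into the root (evolution step)
        and nudge the rejection threshold toward the observed similarity."""
        root = self.roots[label]
        self.roots[label] = (self.momentum * root
                             + (1 - self.momentum) * np.asarray(feature, np.float32))
        self.threshold = 0.95 * self.threshold + 0.05 * sim
```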




Updated: 2021-04-23