Towards wide-scale continuous gesture recognition model for in-depth and grayscale input videos
International Journal of Machine Learning and Cybernetics (IF 3.1) Pub Date: 2021-01-02, DOI: 10.1007/s13042-020-01227-y
Rihem Mahmoud, Selma Belgacem, Mohamed Nazih Omri

In recent years, gesture recognition in video sequences has attracted growing interest in computer vision and behavior understanding, for example in robot and video-game control, video surveillance, automatic video indexing, and content-based video retrieval. Processing large-scale continuous gesture data from depth and grayscale input videos remains a major challenge for researchers. A wide range of recognition models has been proposed for this problem, but none has demonstrated consistently strong performance. The main contributions of this article are: (i) segmenting continuous gesture sequences into isolated gestures, using the average velocity computed from a deep optical flow estimate; (ii) extracting a set of relevant descriptors, called the characteristic signature, which captures intensity and spatial information describing the location, speed, and orientation of the movement; and (iii) feeding the characteristic signatures built from the depth and grayscale sequences of each isolated segment to a linear SVM for classification. An experimental study on standard benchmarks, namely KTH, ChaLearn, and Weizmann, comparing our model with the main models studied in the literature, together with an analysis of the obtained results, clearly shows the limits of the studied models and confirms the performance and efficiency of our model in terms of precision, recall, and robustness.
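To make the pipeline concrete, here is a minimal, hypothetical sketch of the three steps the abstract describes: dense optical flow estimation, segmentation of the continuous stream into isolated gestures where the average flow velocity drops, a simple per-segment motion descriptor, and linear-SVM classification. The function names, the rest threshold, and the descriptor layout are illustrative assumptions (using OpenCV's Farnebäck flow and scikit-learn's LinearSVC), not the authors' implementation.

```python
import cv2
import numpy as np
from sklearn.svm import LinearSVC


def average_velocity(prev_gray, curr_gray):
    """Dense optical flow between two grayscale frames and its mean magnitude."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    return magnitude.mean(), flow


def segment_gestures(frames, rest_threshold=0.5):
    """Split a continuous sequence into isolated gestures at low-motion frames."""
    segments, current = [], []
    for prev_f, curr_f in zip(frames, frames[1:]):
        vel, flow = average_velocity(prev_f, curr_f)
        if vel < rest_threshold:
            # Near-rest frame: close the current gesture segment, if any.
            if current:
                segments.append(current)
                current = []
        else:
            current.append(flow)
    if current:
        segments.append(current)
    return segments


def signature(segment_flows, bins=8):
    """Crude 'characteristic signature': orientation histogram plus mean speed."""
    hist = np.zeros(bins)
    speeds = []
    for flow in segment_flows:
        mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
        h, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
        hist += h
        speeds.append(mag.mean())
    hist /= hist.sum() + 1e-9
    return np.concatenate([hist, [np.mean(speeds)]])


# Illustrative usage: build one signature per isolated segment (for the depth
# and grayscale streams alike) and train a linear SVM on labeled segments.
# X = np.stack([signature(seg) for seg in segments]); y = labels
# clf = LinearSVC().fit(X, y); predictions = clf.predict(X)
```

In this sketch the same descriptor is computed independently on the depth and grayscale sequences and the resulting vectors can be concatenated before classification; the actual descriptor in the paper is richer than this orientation histogram.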




Updated: 2021-01-02