On integration of multiple features for human activity recognition in video sequences
Multimedia Tools and Applications ( IF 3.0 ) Pub Date : 2021-07-31 , DOI: 10.1007/s11042-021-11207-1
Arati Kushwaha 1 , Ashish Khare 1 , Prashant Srivastava 2

Human activity recognition has become one of the most active research areas in computer vision, driven by growing demand in automated monitoring applications such as visual surveillance, human-computer interaction, health care, and security systems. This work introduces an integrated feature descriptor that combines texture and shape features at multiple orientations to build an efficient and robust feature vector for activity recognition in realistic scenarios. The descriptor integrates the Discrete Wavelet Transform (DWT), multiscale Local Binary Pattern (LBP), and Histogram of Oriented Gradients (HOG). HOG extracts local orientation histograms from the frame sequences, multiscale LBP captures the complex structural information of the frames, and DWT provides directional information at multiple scales. By exploiting these properties, the integrated descriptor yields a feature vector that achieves promising activity recognition results on realistic videos. A multiclass Support Vector Machine (SVM) classifier with a one-vs-one architecture is used for activity recognition. The experiments are performed on five publicly available benchmark video datasets: Weizmann, IXMAS, UT Interaction, HMDB51, and UCF101. The results are compared with those of other state-of-the-art methods, both conventional machine learning and deep learning based, to show the effectiveness and usefulness of the proposed work. The experimental results demonstrate that the proposed method performs better than the other state-of-the-art methods.
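To make the idea of an integrated descriptor concrete, the following is a minimal, dependency-free sketch and not the authors' implementation. The abstract does not say how the three components are chained, so the sketch assumes that a one-level Haar DWT is applied to a grayscale frame and that a HOG-style orientation histogram and LBP histograms at two radii are computed on each sub-band and concatenated. All function names, the Haar wavelet, the radii (1, 2), the 9 orientation bins, and the square (non-interpolated) LBP sampling are illustrative assumptions. The one-vs-one SVM stage is omitted.

```python
import math

def haar_dwt2(img):
    """One-level 2D Haar DWT; returns (approximation, 3 detail sub-bands)."""
    h, w = len(img) // 2 * 2, len(img[0]) // 2 * 2
    bands = [[], [], [], []]
    for y in range(0, h, 2):
        rows = [[], [], [], []]
        for x in range(0, w, 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            rows[0].append((a + b + c + d) / 2.0)  # approximation
            rows[1].append((a + b - c - d) / 2.0)  # difference between rows
            rows[2].append((a - b + c - d) / 2.0)  # difference between columns
            rows[3].append((a - b - c + d) / 2.0)  # diagonal difference
        for band, row in zip(bands, rows):
            band.append(row)
    return tuple(bands)

def lbp_histogram(img, radius):
    """Normalized 256-bin histogram of 8-neighbour LBP codes at one radius."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0.0] * 256
    for y in range(radius, len(img) - radius):
        for x in range(radius, len(img[0]) - radius):
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the centre.
                if img[y + dy * radius][x + dx * radius] >= img[y][x]:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist) or 1.0
    return [v / total for v in hist]

def orientation_histogram(img, bins=9):
    """Single-cell HOG-style histogram over unsigned gradient orientations."""
    hist = [0.0] * bins
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle / (180.0 / bins)), bins - 1)
            hist[index] += math.hypot(gx, gy)  # vote weighted by magnitude
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0
    return [v / norm for v in hist]

def integrated_descriptor(frame, lbp_radii=(1, 2)):
    """Concatenate orientation and multiscale LBP histograms of each DWT sub-band."""
    features = []
    for band in haar_dwt2(frame):
        features += orientation_histogram(band)
        for radius in lbp_radii:
            features += lbp_histogram(band, radius)
    return features

# Synthetic 16x16 "frame" standing in for a grayscale video frame.
frame = [[float((x * y) % 7) for x in range(16)] for y in range(16)]
vec = integrated_descriptor(frame)
print(len(vec))  # 4 sub-bands x (9 + 2 * 256) = 2084
```

In a full pipeline, one such vector per frame (or per clip) would be passed to a multiclass SVM trained with pairwise, one-vs-one binary classifiers. A real HOG would also divide the image into cells and blocks instead of using a single global histogram.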




Updated: 2021-08-01