

Real-time crowd behavior recognition in surveillance videos based on deep learning methods
Journal of Real-Time Image Processing (IF 2.9), Pub Date: 2021-05-03, DOI: 10.1007/s11554-021-01116-9
Fariba Rezaei, Mehran Yazdi

Automatic video surveillance of crowded public places is an active research area for security applications. Traditional approaches address crowd behavior recognition with a sequential two-stage pipeline: low-level feature extraction followed by classification. More recently, deep learning has shown promising results compared with traditional methods by extracting high-level representations and solving the problem in an end-to-end pipeline. In this paper, we investigate a deep architecture for crowd event recognition that detects seven behavior categories in the PETS2009 event recognition dataset. More specifically, we apply an integrated handcrafted and Conv-LSTM-AE method, with optical flow images as input, to extract a high-level representation of the data and perform classification. After a latent representation of the input optical flow image sequence is obtained at the bottleneck of the autoencoder (AE), the architecture splits into two separate branches: one is the AE decoder and the other is the classifier. The proposed architecture is trained jointly for representation and classification by defining two different losses. Experimental comparisons with state-of-the-art methods show that our algorithm is promising for real-time event recognition and achieves better performance on the calculated metrics.
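
The joint training over two losses described in the abstract can be illustrated with a minimal sketch. The abstract does not state the form of either loss or how they are combined, so everything below is an assumption made for illustration: mean squared error for the decoder branch, cross-entropy for the classifier branch, a weighted sum with a hypothetical `cls_weight` parameter, and all function names. The Conv-LSTM encoder itself is omitted; the sketch only shows how one reconstruction and one set of class logits, both derived from the same bottleneck, would contribute to a single training objective.

```python
import math
from typing import Sequence


def mse_loss(reconstruction: Sequence[float], target: Sequence[float]) -> float:
    """Reconstruction loss (assumed MSE) between decoder output and input."""
    if len(reconstruction) != len(target) or len(target) == 0:
        raise ValueError("reconstruction and target must be non-empty and equal in length")
    return sum((r - t) ** 2 for r, t in zip(reconstruction, target)) / len(target)


def cross_entropy_loss(logits: Sequence[float], label: int) -> float:
    """Classification loss (assumed cross-entropy) for one sample.

    Uses the log-sum-exp form so large logits do not overflow.
    """
    if not 0 <= label < len(logits):
        raise ValueError("label is out of range for the given logits")
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


def joint_loss(
    reconstruction: Sequence[float],
    target: Sequence[float],
    logits: Sequence[float],
    label: int,
    cls_weight: float = 1.0,  # hypothetical weighting, not taken from the paper
) -> float:
    """Weighted sum of the decoder-branch and classifier-branch losses."""
    return mse_loss(reconstruction, target) + cls_weight * cross_entropy_loss(logits, label)
```

For example, `joint_loss(decoded, flow_values, logits, label)` would take a flattened optical flow frame as `flow_values`, the decoder branch output as `decoded`, and seven classifier logits (one per behavior category) as `logits`. Minimizing the sum pushes the shared bottleneck to retain enough information for reconstruction while remaining discriminative for classification.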


Updated: 2021-05-03
