

Multi-Stream 3D latent feature clustering for abnormality detection in videos
Applied Intelligence (IF 5.3), Pub Date: 2021-05-16, DOI: 10.1007/s10489-021-02356-9
Mujtaba Asad, He Jiang, Jie Yang, Enmei Tu, Aftab Ahmad Malik

Detecting abnormal behavior in surveillance videos is essential for public safety and monitoring. Human-based surveillance, however, demands constant focus and attention from operators, which is difficult to sustain, so automatic detection of such events is of great significance. Abnormal event detection is challenging because labelled data are scarce and such events occur with low probability. In this paper, we propose a novel multi-stream two-stage architecture to detect abnormal behavior in videos. Our contributions are three-fold: 1) In the first stage, we propose a 3D Convolutional Autoencoder (3DCAE) architecture that extracts appearance and motion features, in an unsupervised manner, from both the video frame and dynamic flow input streams of training videos of normal events. 2) We use a multi-objective loss function for 3DCAE reconstruction that focuses more on foreground moving objects than on stationary background information. 3) In the second stage, the fused latent features from the video frame and dynamic flow inputs are grouped into different clusters of normality. We then eliminate the smaller or sparse clusters, which are assumed to contain noisy patterns in the training data, so that the remaining clusters represent stronger normality patterns. A Deep one-class Support Vector Data Description (SVDD) classifier is then trained on these 3D normality clusters to generate an anomaly score for each sample, which differentiates normal from abnormal occurrences. Experimental results on three benchmark datasets, UCSD Pedestrian, Shanghai Tech, and Avenue, show significant improvement in performance compared to state-of-the-art approaches.
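
The second stage described in the abstract (cluster the latent features, discard small clusters, then score samples against what remains) can be illustrated with a minimal sketch. It is not the authors' implementation. It makes several assumptions: the fused 3DCAE latent features are stood in for by plain numeric vectors, k-means is assumed as the clustering method, and the Deep SVDD classifier is replaced by a much simpler distance to the nearest retained centroid. All function names and the `min_size` threshold are hypothetical.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point that is farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means (an assumed stand-in for the paper's clustering).
    Returns (centroids, labels)."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: assignments no longer change
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def prune_small_clusters(centroids, labels, min_size):
    """Keep only centroids whose cluster has at least min_size members,
    treating smaller clusters as noise in the training data."""
    counts = [labels.count(j) for j in range(len(centroids))]
    return [c for c, n in zip(centroids, counts) if n >= min_size]


def anomaly_score(point, kept_centroids):
    """Simplified score: distance to the nearest retained normality cluster.
    (The paper trains a Deep SVDD classifier instead.)"""
    return min(math.sqrt(squared_distance(point, c)) for c in kept_centroids)


if __name__ == "__main__":
    # Toy "latent features": two dense normal groups and one stray sample.
    train = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1),
             (5, 5), (5.1, 5), (5, 5.1), (5.1, 5.1),
             (10, 0)]
    centroids, labels = kmeans(train, k=3)
    kept = prune_small_clusters(centroids, labels, min_size=2)
    for sample in [(0.05, 0.05), (5, 5), (10, 0)]:
        print(sample, round(anomaly_score(sample, kept), 3))
```

In this toy setup the stray training sample forms its own one-member cluster, is pruned, and therefore scores as far from normality at test time, which mirrors the intuition behind removing sparse clusters before training the one-class classifier.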


Updated: 2021-05-17
