Multi-Stream 3D latent feature clustering for abnormality detection in videos

Asad, Mujtaba; Jiang, He; Yang, Jie; Tu, Enmei; Malik, Aftab Ahmad

doi:10.1007/s10489-021-02356-9

Published: 16 May 2021

Volume 52, pages 1126–1143 (2022)
Applied Intelligence

Mujtaba Asad¹, He Jiang¹, Jie Yang¹, Enmei Tu¹ & Aftab Ahmad Malik²

Abstract

Detecting abnormal behavior in surveillance videos is essential for public safety and monitoring. Human-based surveillance, however, demands constant focus and attention from operators, which is difficult to sustain. Automatic detection of such events is therefore of great significance. Abnormal event detection remains a challenging problem because labelled data are scarce and abnormal events occur with low probability. In this paper, we propose a novel multi-stream, two-stage architecture to detect abnormal behavior in videos. Our contributions are three-fold: 1) In the first stage, we propose a 3D Convolutional Autoencoder (3DCAE) architecture that extracts appearance and motion features, in an unsupervised manner, from both the video frame and dynamic flow input streams of training videos containing only normal events. 2) We use a multi-objective loss function for 3DCAE reconstruction that focuses more on foreground moving objects than on the stationary background. 3) In the second stage, the fused latent features from the video frame and dynamic flow inputs are grouped into clusters of normality. We then eliminate the smaller or sparser clusters, which are assumed to contain noisy patterns in the training data, so that the remaining clusters represent stronger normality patterns. A Deep one-class Support Vector Data Description (SVDD) classifier is then trained on these 3D normality clusters to generate an anomaly score for each sample, which is used to differentiate between normal and abnormal occurrences. Experimental results on three benchmark datasets (UCSD Pedestrian, ShanghaiTech, and Avenue) show significant improvement in performance compared to state-of-the-art approaches.
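The second stage described in the abstract (cluster the latent features, drop small clusters, score samples against what remains) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the fused latent features are already available as a NumPy array, uses plain k-means for the clustering step, and substitutes a simple squared distance to the nearest retained centroid for the Deep SVDD classifier. The function names, the `min_fraction` threshold, and the toy data are all hypothetical.

```python
import numpy as np


def pairwise_sq_dist(x, centers):
    """Squared Euclidean distance from each row of x to each center, shape (n, k)."""
    return ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)


def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct training samples.
    centers = x[rng.choice(len(x), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = pairwise_sq_dist(x, centers).argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # an empty cluster keeps its previous center
                centers[j] = members.mean(axis=0)
    labels = pairwise_sq_dist(x, centers).argmin(axis=1)
    return centers, labels


def prune_small_clusters(centers, labels, min_fraction=0.1):
    """Keep only centers whose cluster holds at least min_fraction of the samples."""
    counts = np.bincount(labels, minlength=len(centers))
    keep = counts >= min_fraction * len(labels)
    return centers[keep]


def anomaly_score(z, kept_centers):
    """Stand-in for the SVDD score: squared distance to the nearest kept center."""
    return pairwise_sq_dist(z, kept_centers).min(axis=1)


# Toy "latent features": two dense normal modes plus a few noisy samples.
rng = np.random.default_rng(42)
normal_a = rng.normal(loc=[0.0, 0.0], scale=0.1, size=(100, 2))
normal_b = rng.normal(loc=[5.0, 5.0], scale=0.1, size=(100, 2))
noise = rng.normal(loc=[20.0, -20.0], scale=0.1, size=(4, 2))
latents = np.vstack([normal_a, normal_b, noise])

centers, labels = kmeans(latents, k=3)
kept_centers = prune_small_clusters(centers, labels, min_fraction=0.1)

# One query near a normal mode, one far from all training data.
queries = np.array([[0.0, 0.0], [-30.0, 30.0]])
scores = anomaly_score(queries, kept_centers)
print("kept clusters:", len(kept_centers))
print("scores:", scores)
```

In this sketch a higher score means the sample lies farther from every retained normality cluster. A learned one-class model such as Deep SVDD would replace `anomaly_score` with the distance to a hypersphere center in a trained feature space.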

Acknowledgments

This research is partly supported by NSFC, China (Nos. 61876107, U1803261) and the Shanghai Natural Science Foundation (19ZR1476300).

Author information

Authors and Affiliations

Institute of Image Processing & Pattern Recognition, Shanghai Jiao Tong University, Shanghai, 200240, China
Mujtaba Asad, He Jiang, Jie Yang & Enmei Tu
Department of Software Engineering, Lahore Garrison University, Lahore, Pakistan
Aftab Ahmad Malik

Corresponding authors

Correspondence to Jie Yang or Enmei Tu.

Ethics declarations

Conflict of Interest

We have no conflict of interest to declare.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Asad, M., Jiang, H., Yang, J. et al. Multi-Stream 3D latent feature clustering for abnormality detection in videos. Appl Intell 52, 1126–1143 (2022). https://doi.org/10.1007/s10489-021-02356-9

Accepted: 13 March 2021
Published: 16 May 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s10489-021-02356-9
