RGB+D and deep learning-based real-time detection of suspicious event in Bank-ATMs

  • Special Issue Paper
  • Journal of Real-Time Image Processing

Abstract

Real-time detection of human activities has become increasingly important for the surveillance and security of bank Automated Teller Machines (ATMs) and public offices, owing to the day-to-day increase in criminal activity. Such constrained environments are currently monitored with monocular CCTV cameras, which capture only RGB video. An RGB+D sensor provides depth data of the scene in addition to RGB data. To address the problem of online detection of abnormal activities in bank ATMs, we propose a supervised deep learning framework based on multi-stream CNNs and an RGB+D sensor. From the online RGB+D video stream, motion templates are created from RGB and depth video segments, and CNNs are trained on these templates to detect a suspicious event in an ongoing activity. Moreover, because no dataset is available for analyzing human activities in ATMs, we also contribute a novel RGB+D dataset in this paper. The proposed deep learning-based framework is evaluated with qualitative and quantitative statistical measures and detects suspicious events with a precision of 0.932 and an accuracy of 94.2%. Detailed statistical analysis of the results shows that the proposed framework can detect a suspicious event online, in real time, before the abnormal activity is completed.
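The abstract does not say which motion template the framework uses. As an illustration only, the sketch below builds one classic template, a motion history image in the style of Bobick and Davis' temporal templates, from a short segment of single-channel frames. The function name `motion_history_image`, the parameters `tau` and `diff_threshold`, and the frame-differencing rule are assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def motion_history_image(frames, tau=None, diff_threshold=25):
    """Collapse a segment of single-channel frames into one motion template.

    frames: sequence of 2-D arrays (grayscale RGB frames or depth frames).
    tau: number of steps a motion trace persists (assumed default: segment length - 1).
    diff_threshold: absolute frame difference that counts as motion (assumed value).
    """
    # Use a wide signed type so differences of uint8 or uint16 frames cannot wrap.
    frames = [np.asarray(f, dtype=np.int32) for f in frames]
    if len(frames) < 2:
        raise ValueError("need at least two frames to measure motion")
    if tau is None:
        tau = len(frames) - 1

    mhi = np.zeros(frames[0].shape, dtype=np.float32)
    for prev, curr in zip(frames, frames[1:]):
        moving = np.abs(curr - prev) > diff_threshold
        # Pixels that just moved are set to tau; all others decay by one step.
        mhi = np.where(moving, float(tau), np.maximum(mhi - 1.0, 0.0))

    # Scale to an 8-bit image: recent motion is bright, older motion is dim.
    return (255.0 * mhi / tau).astype(np.uint8)
```

Under these assumptions, the function would be applied once to the grayscale RGB frames of a segment and once to the depth frames, and each resulting template could then be passed to its own CNN stream.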



Acknowledgements

This research was supported by the Science and Engineering Research Board (SERB) under Project No. ECR/2016/000387, in cooperation with the Department of Science and Technology (DST), Government of India. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of DST-SERB or the Government of India. DST-SERB or the Government of India is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon.

Author information

Corresponding author

Correspondence to Pushpajit A. Khaire.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 1084 KB)

About this article

Cite this article

Khaire, P.A., Kumar, P. RGB+D and deep learning-based real-time detection of suspicious event in Bank-ATMs. J Real-Time Image Proc 18, 1789–1801 (2021). https://doi.org/10.1007/s11554-021-01155-2

  • DOI: https://doi.org/10.1007/s11554-021-01155-2
