Computational Method for Recognizing Situations and Objects in the Frames of a Continuous Video Stream Using Deep Neural Networks for Access Control Systems

PATTERN RECOGNITION AND IMAGE PROCESSING

Journal of Computer and Systems Sciences International

Abstract

A computational method for pattern recognition in a continuous video stream using deep neural networks for access control systems is proposed; the method is effective in terms of both performance and accuracy. The class of recognition problems that the method solves from a sequence of video stream frames is identified: the vehicle itself and the characters on its license plate (LP), people's faces, and abnormal situations. In contrast to known solutions, classification with subsequent reinforcement over multiple frames of the video stream is used, together with an algorithm for the automatic annotation of images. Neural network architectures adapted to these problems are proposed: networks with independent recurrent layers for classifying video fragments, a dual network for face recognition, and a deep neural network for recognizing the characters of a vehicle's license plate. New databases for neural network training are created. A schematic diagram of an intelligent access control system for ensuring the security of an enterprise is proposed; its distinctive feature is the use of a multirotor unmanned aerial vehicle with a computing unit. Field experiments are carried out, and the accuracy and performance of the computational method are assessed on each problem. Software modules in the Python language are developed for the tasks of the intelligent access control system.
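The abstract's central idea, classification with subsequent reinforcement over multiple video frames, can be illustrated with a minimal sketch. This is not the authors' implementation: the per-frame classifier interface, window size, and confidence threshold below are illustrative assumptions.

```python
# Minimal sketch of multi-frame reinforcement: per-frame softmax scores
# are averaged over a sliding window, and a class decision is issued
# only once the aggregated confidence clears a threshold. The window
# size and threshold are illustrative assumptions, not the paper's values.
from collections import deque

import numpy as np


class MultiFrameClassifier:
    def __init__(self, frame_classifier, window=15, threshold=0.8):
        self.frame_classifier = frame_classifier  # callable: frame -> softmax score vector
        self.scores = deque(maxlen=window)        # sliding window of per-frame scores
        self.threshold = threshold

    def update(self, frame):
        """Classify one frame; return (class, confidence) once reinforced, else None."""
        self.scores.append(self.frame_classifier(frame))
        mean_scores = np.mean(np.stack(self.scores), axis=0)  # aggregate evidence over the window
        best = int(np.argmax(mean_scores))
        if len(self.scores) == self.scores.maxlen and mean_scores[best] >= self.threshold:
            return best, float(mean_scores[best])
        return None  # not yet confident: keep accumulating frames
```

The dual network for face recognition mentioned in the abstract presumably compares embeddings of two face images produced by a shared encoder. A hedged sketch of such a verification step follows, with the encoder and the distance threshold again being assumptions rather than the paper's parameters:

```python
# Hedged sketch of verification with a dual (two-branch) network: two
# face crops are mapped to embeddings by the same encoder, and the pair
# is accepted when the Euclidean distance between the embeddings falls
# below a threshold. The threshold value is an illustrative assumption.
import numpy as np


def verify(encoder, face_a, face_b, threshold=1.1):
    emb_a = encoder(face_a)  # e.g., an L2-normalized embedding vector
    emb_b = encoder(face_b)
    distance = float(np.linalg.norm(emb_a - emb_b))
    return distance < threshold, distance
```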

Funding

This work was supported by the Ministry of Education and Science of the Russian Federation as part of a state task, project no. 2.1898.2017/PCH, "Development of Mathematical and Algorithmic Support of an Intelligent Information and Telecommunication Security System of the University."

Author information

Corresponding author

Correspondence to O. S. Amosov.

Additional information

Translated by O. Pismenov

About this article

Cite this article

Amosov, O.S., Amosova, S.G., Zhiganov, S.V. et al. Computational Method for Recognizing Situations and Objects in the Frames of a Continuous Video Stream Using Deep Neural Networks for Access Control Systems. J. Comput. Syst. Sci. Int. 59, 712–727 (2020). https://doi.org/10.1134/S1064230720050020
