Abstract
A computational method, effective in both performance and accuracy, is proposed for pattern recognition in a continuous video stream using deep neural networks for access control systems. The class of recognition problems solved by the method over a sequence of video-stream frames is identified: the vehicle itself and the characters on its license plate (LP), people's faces, and abnormal situations. In contrast to known solutions, classification is reinforced over multiple frames of the video stream and combined with an algorithm for the automatic annotation of images. Neural network architectures with independently recurrent layers for classifying video fragments, adapted to these problems, a dual network for face recognition, and a deep neural network for license plate character recognition are proposed. New databases for neural network training are created. A schematic diagram of an intelligent access control system for ensuring enterprise security is proposed; its distinctive feature is the use of a multirotor unmanned aerial vehicle with an onboard computing unit. Field experiments are carried out, and the accuracy and performance of the computational method are assessed for each problem. Software modules in the Python language for the tasks of the intelligent access control system are developed.
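The multi-frame reinforcement described above, in which per-frame classifier outputs are aggregated across consecutive frames so that a decision rests on accumulated evidence rather than a single frame, can be sketched as follows. This is a minimal illustration only, not the authors' implementation; the function name and the toy probabilities are hypothetical, and averaging of softmax outputs is just one plausible aggregation rule:

```python
import numpy as np

def reinforce_over_frames(frame_probs):
    """Aggregate per-frame class probabilities across a video fragment.

    frame_probs: (num_frames, num_classes) array of per-frame softmax outputs.
    Returns (class_index, confidence) for the class with the highest
    frame-averaged probability.
    """
    probs = np.asarray(frame_probs, dtype=float)
    mean_probs = probs.mean(axis=0)      # average evidence over all frames
    best = int(np.argmax(mean_probs))    # class best supported overall
    return best, float(mean_probs[best])

# Toy example: three noisy frames showing the same license-plate character.
frames = [
    [0.6, 0.3, 0.1],   # frame 1: weak evidence for class 0
    [0.7, 0.2, 0.1],   # frame 2: evidence grows
    [0.8, 0.1, 0.1],   # frame 3: evidence accumulates further
]
cls, conf = reinforce_over_frames(frames)
print(cls, round(conf, 2))  # -> 0 0.7
```

A single noisy frame (e.g., frame 1 alone) would yield a low-confidence decision; averaging over the fragment raises the confidence of the correct class, which is the intuition behind reinforcing classification with multiple frames.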
Funding
This work was supported by the Ministry of Education and Science of the Russian Federation as part of a state task, project no. 2.1898.2017/PCH, on the “Development of Mathematical and Algorithmic Support of an Intelligent Information and Telecommunication Security System of the University.”
Translated by O. Pismenov
Cite this article
Amosov, O.S., Amosova, S.G., Zhiganov, S.V. et al. Computational Method for Recognizing Situations and Objects in the Frames of a Continuous Video Stream Using Deep Neural Networks for Access Control Systems. J. Comput. Syst. Sci. Int. 59, 712–727 (2020). https://doi.org/10.1134/S1064230720050020