
CNN-based single object detection and tracking in videos and its application to drone detection

Multimedia Tools and Applications

Abstract

This paper presents convolutional neural network (CNN)-based algorithms for single object detection and tracking. CNN-based object detection methods are directly applicable to static images, but not to videos. Model-free visual object tracking methods, on the other hand, cannot locate an object until a ground-truth bounding box of the target is provided. Moreover, many annotated video datasets of the target object are required to train both the object detectors and the visual trackers. This work proposes three simple yet effective object detection and tracking algorithms for videos that combine a state-of-the-art object detector with a visual tracker, for circumstances in which only a few static images of the target are available for training. The proposed algorithms are evaluated on a drone detection task, and the experimental results demonstrate their effectiveness.
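The abstract does not describe the three proposed algorithms in detail, so the following is only a minimal sketch of one generic way a per-frame detector and a model-free tracker can be combined: the detector searches each frame until it finds the target with sufficient confidence, its bounding box then initializes the tracker, and the detector takes over again whenever the tracker reports failure. The names `Detection`, `detect_then_track`, `detect`, `make_tracker`, and `det_threshold` are all hypothetical and chosen for illustration; this is not the paper's method.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height); an assumed convention


@dataclass
class Detection:
    box: Box
    score: float  # detector confidence in [0, 1]


# A tracker is modeled as a callable: frame -> Box, or None when tracking fails.
Tracker = Callable[[Any], Optional[Box]]


def detect_then_track(
    frames: Iterable[Any],
    detect: Callable[[Any], Optional[Detection]],
    make_tracker: Callable[[Any, Box], Tracker],
    det_threshold: float = 0.5,  # hypothetical value, not taken from the paper
) -> List[Optional[Box]]:
    """Return one box (or None) per frame.

    The detector supplies the initial box that a model-free tracker would
    otherwise need as ground truth; the tracker carries the target across
    frames until it fails, at which point detection resumes.
    """
    tracker: Optional[Tracker] = None
    results: List[Optional[Box]] = []
    for frame in frames:
        box: Optional[Box] = None
        if tracker is not None:
            box = tracker(frame)
            if box is None:
                tracker = None  # tracking failed; fall back to detection
        if tracker is None:
            detection = detect(frame)
            if detection is not None and detection.score >= det_threshold:
                box = detection.box
                # Initialize a fresh tracker from the detected box.
                tracker = make_tracker(frame, box)
        results.append(box)
    return results
```

In practice `detect` might wrap a CNN detector fine-tuned on a few static images of the target, and `make_tracker` might wrap a correlation-filter or Siamese tracker; both are left as plain callables here so the control flow stays independent of any particular library.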



Acknowledgements

This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (MIST) (No. 2017R1C1B5017125 and No. 2019R1F1A1040709).

Author information

Corresponding author

Correspondence to Dong-Hyun Lee.

About this article

Cite this article

Lee, DH. CNN-based single object detection and tracking in videos and its application to drone detection. Multimed Tools Appl 80, 34237–34248 (2021). https://doi.org/10.1007/s11042-020-09924-0

  • DOI: https://doi.org/10.1007/s11042-020-09924-0
