
A real-time siamese tracker deployed on UAVs

  • Original Research Paper
  • Published:
Journal of Real-Time Image Processing


Abstract

Visual object tracking is an essential enabler for the automation of UAVs. Recently, Siamese-network-based trackers have achieved excellent performance on offline benchmarks. These trackers usually adopt classic deep and wide networks, such as AlexNet, VGGNet, and ResNet, to extract features from the template frame and the detection frame. However, because embedded devices offer limited computing power, such unmodified models are too computationally heavy to deploy on UAVs. In this paper, we propose a guideline for designing a slim backbone: for every layer, the dimension of the output should be smaller than that of the input. Following this guideline, we reduce the computational cost of AlexNet by 59.4% while the tracker maintains comparable accuracy. In addition, we adopt an anchor-free network as the tracking head, which requires less computation than an anchor-based one. With these approaches, our tracker achieves an AUC of 60.9% on the UAV123 data set and runs at 30 frames per second on an NVIDIA Jetson TX2, and can therefore be embedded in UAVs. To the best of our knowledge, it is the first real-time Siamese tracker deployed on the embedded system of a UAV. The code is available at GitHub.
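The slim-backbone guideline can be checked mechanically: for each layer k, compare the output volume w_k x h_k x c_k against the input volume. The sketch below is illustrative only (not the authors' code), and the layer shapes in it are hypothetical:

```python
# Sketch: check the slim-backbone guideline that every layer's output
# dimension (w_k * h_k * c_k) is no larger than its input dimension.
# The layer shapes below are hypothetical, not taken from the paper.

def follows_guideline(shapes):
    """shapes: list of (w, h, c) per layer, index 0 = the input frame."""
    dims = [w * h * c for (w, h, c) in shapes]
    return all(dims[k] <= dims[k - 1] for k in range(1, len(dims)))

# A hypothetical slim backbone: each stage shrinks the feature volume.
slim = [(127, 127, 3), (59, 59, 12), (29, 29, 32), (13, 13, 64)]
# A hypothetical wide backbone that violates the rule at its first conv
# layer, where the channel count jumps from 3 to 96.
wide = [(127, 127, 3), (63, 63, 96), (31, 31, 256), (15, 15, 384)]

print(follows_guideline(slim))  # True
print(follows_guideline(wide))  # False
```

Under this reading of the guideline, a backbone is "slim" exactly when the per-layer feature volumes form a non-increasing sequence.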


Notes

  1. https://github.com/bitshenwenxiao/SiamSlim.

  2. https://www.youtube.com/channel/UCFww_uAXBJSC2jcrRDL940w.

Abbreviations

\(x^0\) :

the detection frame.

\(x^k\) :

the feature of detection frame of kth layer.

\(z^0\) :

the template frame.

\(z^k\) :

the feature of template frame of kth layer.

\(f(\cdot )\) :

the mapping of the feature extractor network.

\(f_k(\cdot )\) :

the mapping of kth layer of the feature extractor network.

\(g(\cdot )\) :

the mapping of the tracking head network.

\(w_k\) :

the width of the feature map of kth layer.

\(h_k\) :

the height of the feature map of kth layer.

\(c_k\) :

the number of channels of the feature map of kth layer.

\(d_k\) :

the dimension of discrete state space of kth layer.

\(X^d\) :

the state space of which the dimension is d.

\(c(\cdot )\) :

the cardinality of state space.

\(C_{cls}^x(\cdot )\) :

a convolution layer for the feature of the detection frame in classification branch.

\(C_{cls}^z(\cdot )\) :

a convolution layer for the feature of the template frame in classification branch.

\(C_{reg}^x(\cdot )\) :

a convolution layer for the feature of the detection frame in regression branch.

\(C_{reg}^z(\cdot )\) :

a convolution layer for the feature of the template frame in regression branch.

\(n_s\) :

the number of strides of the backbone.

\((i, j)\) :

the position in output map.

\(P_{cls}^{w \times h \times 2}\) :

the output of the classification branch.

\(P_{reg}^{w \times h \times 4} = \left[ {\begin{array}{*{20}{c}}{d_l}&{d_t}&{d_r}&{d_b}\end{array}} \right] \) :

the output of the regression branch.

\(*\) :

multi-channel convolution.

box :

the bounding box of the target.

\((box_{l}, box_{t})\) :

the top-left corner point of the target.

\((box_{r}, box_{b})\) :

the bottom-right corner point of the target.

L :

the loss of training.
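The regression output \(P_{reg}^{w \times h \times 4}\) maps each position \((i, j)\) back to a bounding box. As a hedged sketch of the usual anchor-free decoding (the FCOS/SiamFC++-style convention, which may differ from the paper's exact mapping, e.g. by a crop offset), the corners are recovered from the per-position offsets \((d_l, d_t, d_r, d_b)\) and the backbone stride \(n_s\):

```python
# Sketch of anchor-free box decoding at output position (i, j).
# Assumes the common FCOS-style convention: the position is mapped to
# image coordinates via the backbone stride n_s, and the four offsets
# (d_l, d_t, d_r, d_b) are distances to the box sides. This convention
# is an assumption, not the paper's verified formula.

def decode_box(i, j, d_l, d_t, d_r, d_b, n_s):
    """Return (box_l, box_t, box_r, box_b) in image coordinates."""
    x = j * n_s  # column index -> horizontal image coordinate
    y = i * n_s  # row index -> vertical image coordinate
    return (x - d_l, y - d_t, x + d_r, y + d_b)

print(decode_box(i=8, j=8, d_l=20, d_t=10, d_r=20, d_b=10, n_s=8))
# (44, 54, 84, 74)
```

Conversely, training targets for a position are obtained by inverting this mapping: \(d_l = x - box_l\), \(d_t = y - box_t\), \(d_r = box_r - x\), \(d_b = box_b - y\).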


Author information


Corresponding author

Correspondence to Tao Song.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Shen, H., Lin, D. & Song, T. A real-time siamese tracker deployed on UAVs. J Real-Time Image Proc 19, 463–473 (2022). https://doi.org/10.1007/s11554-021-01190-z

