
A Meta-Q-Learning Approach to Discriminative Correlation Filter based Visual Tracking

Published in: Journal of Intelligent & Robotic Systems

Abstract

Visual object tracking remains a challenging computer vision problem with numerous real-world applications. Discriminative correlation filter (DCF)-based methods are a recent state-of-the-art approach to this problem. The learning rate of a DCF is typically fixed, regardless of the situation. However, this rate is important for robust tracking, since real-world video sequences contain a variety of dynamic changes, such as occlusions, motion blur, and deformations. In this study, we propose the Meta-Q-learning Correlation Filter (MQCF), a method that uses reinforcement learning to dynamically determine the learning rate of a baseline DCF-based tracker built on hand-crafted Histogram of Oriented Gradients (HOG) features. Incorporating reinforcement learning allows us to autonomously train a function that, given an image patch, outputs a situation-dependent learning rate for the baseline tracker. We evaluated this method on two open benchmarks, OTB-2015 and VOT-2015, and found that our MQCF tracker outperformed a state-of-the-art baseline tracker by 1.8% in Area Under the Curve on OTB-2015 and achieved an 8.4% relative gain in Expected Average Overlap on the VOT-2015 challenge. Our results demonstrate the advantages of meta-learning for DCF-based visual object tracking.
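The core idea above can be sketched in a few lines: a DCF tracker updates its filter model by linear interpolation with learning rate η, and a Q-learning agent picks η per frame from the current tracking state. The sketch below is a minimal illustration, not the paper's actual method: it uses a tabular Q-function as a stand-in for the deep Q-network the paper trains on image patches, and the state discretization, action set `LEARNING_RATES`, and class `QLearningRateSelector` are all hypothetical names introduced here.

```python
import numpy as np

# Hypothetical discrete set of learning rates the agent can choose from.
LEARNING_RATES = [0.005, 0.01, 0.02, 0.04]


class QLearningRateSelector:
    """Tabular Q-learning sketch: maps a discretized tracking state
    (e.g. a confidence bucket derived from the response map) to an
    index into LEARNING_RATES. The paper uses a deep Q-network on
    image patches instead of a table."""

    def __init__(self, n_states, n_actions=len(LEARNING_RATES),
                 alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = np.zeros((n_states, n_actions))
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def select(self, state, rng=np.random):
        if rng.random() < self.epsilon:            # epsilon-greedy exploration
            return rng.randint(len(LEARNING_RATES))
        return int(np.argmax(self.q[state]))       # greedy exploitation

    def update(self, s, a, reward, s_next):
        # Standard Q-learning target: r + gamma * max_a' Q(s', a')
        target = reward + self.gamma * self.q[s_next].max()
        self.q[s, a] += self.alpha * (target - self.q[s, a])


def update_filter(model, new_model, eta):
    """Linear model interpolation used by typical DCF trackers:
    H_t = (1 - eta) * H_{t-1} + eta * H_new."""
    return (1.0 - eta) * model + eta * new_model
```

In a tracking loop, `select` would be called once per frame to pick η before `update_filter` blends the newly estimated filter into the running model; the reward (e.g. overlap with the ground truth during training) then drives the Q-update.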



Acknowledgments

The authors would like to thank the associate editor and anonymous reviewers for their time and constructive comments towards the improvement of our work. This research was partly supported by JSPS-KAKENHI (No. 17H06310 and 19H04180).

Author information

Correspondence to Akihiro Kubo.

Declaration

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kubo, A., Meshgi, K. & Ishii, S. A Meta-Q-Learning Approach to Discriminative Correlation Filter based Visual Tracking. J Intell Robot Syst 101, 11 (2021). https://doi.org/10.1007/s10846-020-01273-2
