

A quantitative attribute-based benchmark methodology for single-target visual tracking
Frontiers of Information Technology & Electronic Engineering (IF 3), Pub Date: 2020-04-01, DOI: 10.1631/fitee.1900245
Wen-jing Kang, Chang Liu, Gong-liang Liu

In the past several years, various visual object tracking benchmarks have been proposed, and some of them have been widely used to evaluate numerous recently proposed trackers. However, most discussions focus on overall performance and cannot describe the strengths and weaknesses of the trackers in detail. Meanwhile, several benchmark measures that are often used in tests lack a convincing interpretation. In this paper, 12 frame-wise visual attributes that reflect different characteristics of image sequences are collated, and each is given a normalized quantitative formulaic definition for the first time. Based on these definitions, we propose two novel test methodologies, a correlation-based test and a weight-based test, which provide a more intuitive and accessible demonstration of the trackers’ performance in each aspect. These methods are then applied to the raw results from one of the best-known tracking challenges, the Visual Object Tracking (VOT) Challenge 2017. The tests show that most trackers did not perform well when the size of the target changed rapidly or intensely, and even advanced deep-learning-based trackers did not fully solve the problem. Although the scale of the target is not considered in the calculation of the center location error, in practical tests the center location error is still sensitive to changes in the target’s size.
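To make the last two observations concrete, the following is a minimal Python sketch, not the authors' implementation. It uses the common definition of center location error (Euclidean distance between bounding-box centers), which by construction ignores box size. The `scale_change` attribute and the use of a Pearson coefficient are assumptions made here for illustration only; the abstract does not state the paper's 12 attribute formulas or the exact form of its correlation-based test.

```python
import math


def center_location_error(pred, gt):
    """Euclidean distance between the centers of two (x, y, w, h) boxes.

    Box width and height only locate the center; the error itself
    does not depend on how large either box is.
    """
    px, py = pred[0] + pred[2] / 2.0, pred[1] + pred[3] / 2.0
    gx, gy = gt[0] + gt[2] / 2.0, gt[1] + gt[3] / 2.0
    return math.hypot(px - gx, py - gy)


def scale_change(prev_gt, gt):
    """Hypothetical frame-wise attribute (an assumption, not the paper's formula).

    Relative change in ground-truth box area between consecutive frames,
    normalized to [0, 1]. Assumes both boxes have positive area.
    """
    a0 = prev_gt[2] * prev_gt[3]
    a1 = gt[2] * gt[3]
    return abs(a1 - a0) / max(a0, a1)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # correlation is undefined for a constant series; report 0
    return cov / (sx * sy)
```

Under these assumptions, one would compute `scale_change` and `center_location_error` for each frame of a sequence and pass the two series to `pearson`; a clearly positive coefficient would be consistent with the abstract's remark that the error is still sensitive to changes in target size even though size never enters its formula.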


Updated: 2020-04-18
