Efficient Scale Estimation Methods using Lightweight Deep Convolutional Neural Networks for Visual Tracking, arXiv - CS - Computer Vision and Pattern Recognition



arXiv - CS - Computer Vision and Pattern Recognition, Pub Date: 2020-04-06, DOI: arxiv-2004.02933
Seyed Mojtaba Marvasti-Zadeh, Hossein Ghanei-Yakhdan, Shohreh Kasaei

In recent years, visual tracking methods based on discriminative correlation filters (DCF) have shown great promise. However, most of these methods lack robust scale estimation. Although a wide range of recent DCF-based methods exploit features extracted from deep convolutional neural networks (CNNs) in their translation model, the scale of the visual target is still estimated from hand-crafted features. Because exploiting CNNs imposes a high computational burden, this paper uses pre-trained lightweight CNN models to propose two efficient scale estimation methods, which not only improve visual tracking performance but also provide acceptable tracking speeds. The proposed methods are formulated on either a holistic or a region representation of convolutional feature maps, so that they integrate efficiently into DCF formulations and learn a robust scale model in the frequency domain. Moreover, in contrast to conventional scale estimation methods, which iteratively extract features from different target regions, the proposed methods use one-pass feature extraction processes that significantly improve computational efficiency. Comprehensive experimental results on the OTB-50, OTB-100, TC-128 and VOT-2018 visual tracking datasets demonstrate that the proposed visual tracking methods effectively outperform state-of-the-art methods.
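The abstract does not spell out the formulation, so the following is only a rough illustration of what learning a scale model in the frequency domain can look like for a DCF. It sketches a generic one-dimensional scale filter in the style of earlier DCF scale estimators such as DSST, and it is not the authors' method. The CNN feature extraction is replaced by plain lists of numbers (one value per scale level and channel), and every function name, the Gaussian label width `sigma`, and the regularizer `lam` are assumptions made for this example. A naive DFT is used only to keep the sketch free of dependencies.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse is scaled by 1/n)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def gaussian_label(n, sigma=1.0):
    """Desired filter response: a Gaussian peaked at the middle scale level."""
    c = n // 2
    return [math.exp(-0.5 * ((i - c) / sigma) ** 2) for i in range(n)]


def train_scale_filter(feats, sigma=1.0, lam=1e-2):
    """Learn a 1-D scale filter in the frequency domain.

    feats: list of d channels, each a list of S values (one per scale level).
    Returns the per-channel numerator and the shared, regularized denominator.
    """
    S = len(feats[0])
    G = dft(gaussian_label(S, sigma))
    F = [dft(ch) for ch in feats]
    # Numerator: conj(G) * F for each channel.
    num = [[G[k].conjugate() * Fl[k] for k in range(S)] for Fl in F]
    # Denominator: feature energy summed over channels, plus regularization.
    den = [sum(abs(Fl[k]) ** 2 for Fl in F) + lam for k in range(S)]
    return num, den


def scale_response(num, den, feats):
    """Correlate new per-scale features with the learned filter."""
    S = len(den)
    Z = [dft(ch) for ch in feats]
    Y = [sum(a[k].conjugate() * z[k] for a, z in zip(num, Z)) / den[k]
         for k in range(S)]
    return [v.real for v in dft(Y, inverse=True)]


def best_scale_index(response):
    """Index of the scale level with the highest filter response."""
    return max(range(len(response)), key=response.__getitem__)
```

Training on a set of per-scale features and evaluating on the same features should place the response peak at the middle scale level. Circularly shifting the features along the scale axis should move the peak by the same number of levels, which is how such a filter reports a change in target size.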


Updated: 2020-04-08
