LSTM guided ensemble correlation filter tracking with appearance model pool
Computer Vision and Image Understanding (IF 4.5), Pub Date: 2020-02-25, DOI: 10.1016/j.cviu.2020.102935
Monika Jain, Subramanyam A.V., Simon Denman, Sridha Sridharan, Clinton Fookes

Deep learning based visual trackers have the potential to provide good performance for object tracking. Most of them use hierarchical features learned from multiple layers of a deep network. However, issues related to the deterministic aggregation of these features from various layers, difficulties in estimating variations in scale or rotation of the object being tracked, as well as challenges in effectively modelling the object’s appearance over long time periods leave substantial scope to improve performance. In this paper, we propose a tracker that learns correlation filters over features from multiple layers of a VGG network. A correlation filter for an individual layer is used to predict the target location. We adaptively learn the contribution of an ensemble of correlation filters to the final location estimation using an LSTM. An adaptive approach is advantageous because different layers encode diverse feature representations, and a uniform contribution would not fully exploit this contrastive information. To this end, we use an LSTM, as it encodes the interactions of past appearances, which is useful for tracking. Further, the scale and rotation parameters are estimated using respective correlation filters. Additionally, an appearance model pool is used to prevent the correlation filter from drifting. Experimental results on five public datasets, namely the Object Tracking Benchmark (OTB100), Visual Object Tracking (VOT) Benchmark 2016, VOT Benchmark 2017, the Tracking Dataset and the UAV123 Dataset, show that our approach outperforms state-of-the-art approaches for object tracking.
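To make the adaptive fusion step concrete, the sketch below shows one way an LSTM could weight per-layer correlation-filter response maps before selecting the target location. It is an illustrative assumption rather than the authors' implementation: the class name LSTMFusion, the use of per-layer peak responses as the LSTM input, and the softmax weighting are all hypothetical, and PyTorch merely stands in for whatever framework the paper actually uses.

# Minimal sketch (not the authors' code) of LSTM-guided fusion of per-layer
# correlation-filter responses; names, shapes and the weighting scheme are
# illustrative assumptions.
import torch
import torch.nn as nn

class LSTMFusion(nn.Module):
    """Predicts one weight per correlation-filter response map at each frame."""
    def __init__(self, num_layers=3, hidden_size=32):
        super().__init__()
        # Hypothetical input: the peak response of each layer's filter,
        # collected over past frames, so the LSTM sees appearance history.
        self.lstm = nn.LSTM(input_size=num_layers, hidden_size=hidden_size,
                            batch_first=True)
        self.head = nn.Linear(hidden_size, num_layers)

    def forward(self, peak_history):
        # peak_history: (1, T, num_layers) peak responses from the last T frames.
        out, _ = self.lstm(peak_history)
        # Softmax keeps the per-layer contributions positive and summing to one.
        return torch.softmax(self.head(out[:, -1]), dim=-1)

def fuse_responses(responses, weights):
    # responses: list of (H, W) correlation response maps, one per VGG layer.
    # weights:   (num_layers,) adaptive contribution of each layer.
    stacked = torch.stack(responses)                  # (num_layers, H, W)
    fused = (weights.view(-1, 1, 1) * stacked).sum(0)
    peak = torch.nonzero(fused == fused.max())[0]     # location of the maximum
    return fused, tuple(peak.tolist())

# Toy usage with random tensors standing in for real filter responses.
torch.manual_seed(0)
responses = [torch.rand(50, 50) for _ in range(3)]   # three VGG layers
history = torch.rand(1, 10, 3)                        # peaks over 10 past frames
weights = LSTMFusion()(history).squeeze(0).detach()
fused, (row, col) = fuse_responses(responses, weights)
print("layer weights:", weights.tolist(), "predicted peak:", (row, col))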



Updated: 2020-02-25