Online dual dictionary learning for visual object tracking, Journal of Ambient Intelligence and Humanized Computing


Online dual dictionary learning for visual object tracking
Journal of Ambient Intelligence and Humanized Computing, Pub Date: 2021-01-05, DOI: 10.1007/s12652-020-02799-x
Xu Cheng, Yifeng Zhang, Lin Zhou, Guojun Lu

Sparse representation has been widely applied to visual tracking. Most existing tracking algorithms based on sparse representation use the l₀ or l₁ norm to solve for the sparse coefficients, which makes computing the solution very time-consuming. In this paper, we propose an effective dual dictionary learning model for visual tracking. The model consists of a discriminative dictionary and an analytic dictionary, which work together to perform representation and discrimination simultaneously. First, we use the object states in the first ten frames of a video to initialize the dual dictionary; during tracking, the two dictionaries are updated alternately. Second, local and global information about the object is integrated into the dual dictionary learning model. The sparse coefficients of each patch encode the local structural information of the object, and all the sparse coefficients within one object state together form a global object representation. We develop a likelihood function that takes an adaptive threshold into account to de-noise the global representation. In addition, the object template is updated via an online scheme to adapt to changes in object appearance. Experiments on a number of common benchmark test sets show that our approach is more effective than existing methods.
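The abstract does not give the model's equations, so the following is only a rough sketch of the general idea, not the authors' method. It assumes the "analytic dictionary" behaves like an analysis operator P that produces patch codes with a single matrix product (which is why no iterative l₀/l₁ solver is needed), and that the other dictionary D reconstructs patches from those codes. The function names, the threshold rule (a fixed fraction of the largest coefficient), and the exponential likelihood are all hypothetical choices made for illustration.

```python
import numpy as np


def encode_patches(P, patches):
    """Code all patches at once with one matrix product (no iterative solver)."""
    return P @ patches  # shape: (n_atoms, n_patches)


def global_representation(codes, ratio=0.1):
    """Concatenate the per-patch codes and zero out small coefficients.

    The threshold adapts to the candidate: it is `ratio` times the largest
    absolute coefficient. This rule is an assumption, not taken from the paper.
    """
    g = codes.flatten(order="F")  # patch 1 codes, then patch 2 codes, ...
    threshold = ratio * np.max(np.abs(g))
    return np.where(np.abs(g) >= threshold, g, 0.0)


def likelihood(D, P, patches, sigma=1.0):
    """Score a candidate state by how well D reconstructs it from its codes."""
    codes = encode_patches(P, patches)
    error = np.sum((patches - D @ codes) ** 2)
    return float(np.exp(-error / sigma))


# Toy data: 9 patches of dimension 16, dictionaries with 8 atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((16, 8))           # reconstruction dictionary (assumed)
P = np.linalg.pinv(D)                      # analysis operator (assumed), P @ D = I

target = D @ rng.standard_normal((8, 9))   # patches that lie in the span of D
clutter = rng.standard_normal((16, 9))     # unrelated background patches

g_target = global_representation(encode_patches(P, target))
score_target = likelihood(D, P, target)
score_clutter = likelihood(D, P, clutter)
print(g_target.shape, score_target > score_clutter)
```

In this toy setup the candidate whose patches lie in the span of D is reconstructed almost exactly and therefore scores higher than the background candidate. A real tracker would learn D and P from the first frames and update them during tracking, which this sketch does not attempt.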


Updated: 2021-01-06
