Long short-term memory improved Siamese network for robust target tracking
Journal of Electronic Imaging ( IF 1.0 ) Pub Date : 2021-02-01 , DOI: 10.1117/1.jei.30.1.013017
Yaping Li 1 , Jinfu Yang 1 , Zhiyong Li 1

Visual target tracking is an important function in real-time video monitoring applications, and its performance determines the feasibility of many higher-level tasks. At present, Siamese-network trackers based on template matching show great potential: they balance accuracy and speed by using a pre-trained convolutional network to extract deep features for target representation and by tracking each frame off-line. In existing algorithms, however, the target template feature is obtained only from the first frame of the video. Tracking performance thus depends entirely on the template-matching framework, which treats frames independently and ignores the inter-frame connections of the video sequence. As a result, existing algorithms do not perform well under large deformation and severe occlusion. We propose a long short-term memory (LSTM) improved Siamese network (LSiam) model, which combines the time-domain regression capability of the LSTM with the Siamese network's balance of tracking accuracy and speed. It exploits the temporal and spatial correlation between video frames, augmenting traditional Siamese-network trackers with an LSTM prediction module. In addition, an improved template-updating module is constructed to combine the original template with the changed appearance. The proposed model is verified in two types of difficult scenarios: deformation and occlusion. Experimental results show that our approach achieves better performance in terms of tracking accuracy.
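To make the two core ideas concrete, the sketch below illustrates (a) Siamese-style template matching via cross-correlation of a template feature with a search-region feature, and (b) a template-updating step that combines the original template with a newly observed appearance. This is a minimal illustration, not the authors' implementation: the convex-combination update rule, the `alpha` weight, and the raw 2-D feature arrays are all assumptions standing in for the paper's learned LSTM prediction module and deep features.

```python
import numpy as np

def cross_correlate(template, search):
    """Slide the template over the search region and return the
    similarity (response) map, as in Siamese template matching.
    The peak of the map indicates the predicted target location."""
    th, tw = template.shape
    sh, sw = search.shape
    out = np.empty((sh - th + 1, sw - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(template * search[i:i + th, j:j + tw])
    return out

def update_template(z0, z_pred, alpha=0.3):
    """Hypothetical update rule: convex combination of the first-frame
    template z0 with a predicted appearance z_pred. In LSiam the new
    appearance would come from the LSTM prediction module; here it is
    simply supplied by the caller."""
    return (1.0 - alpha) * z0 + alpha * z_pred

# Toy example: a 3x3 bright patch embedded in an 8x8 search region.
template = np.ones((3, 3))
search = np.zeros((8, 8))
search[2:5, 3:6] = 1.0  # target located at row 2, column 3

response = cross_correlate(template, search)
row, col = np.unravel_index(np.argmax(response), response.shape)
print(row, col)  # peak of the response map gives the target position

new_template = update_template(np.zeros((3, 3)), np.ones((3, 3)))
```

In a first-frame-only tracker, `template` never changes, which is exactly the limitation described above; the `update_template` step lets the template drift toward the current appearance while anchoring it to the original to resist contamination from occlusions.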

Updated: 2021-02-25