Visual tracking based on depth cross-correlation and feature alignment
Journal of Signal Processing Systems (IF 1.6), Pub Date: 2022-09-03, DOI: 10.1007/s11265-022-01791-2
Guang Han, Yao Xiao, Fuxiang Wang, Xuhui Liu

Visual tracking technology based on the Siamese network has achieved excellent performance on many tracking datasets. However, these trackers do not provide desirable results in unconstrained environments, for example under fast motion and extensive scale variations. To address these challenges, this paper proposes three modules: an Adaptive Dilated Fusion module, a Depth Pixel-Wise Correlation module, and a Feature Alignment module. The Adaptive Dilated Fusion module handles extensive scale variations by adding a receptive field pyramid on the last layer of the Siamese network; the Depth Pixel-Wise Correlation module aims to extract pixel-level features through average pooling and maximum pooling and to reduce the influence of background noise; the Feature Alignment module alleviates the mismatch between the classification task and the regression task. Experiments are performed on several public datasets, including VOT2017, OTB100, and LaSOT. The tracking performance of the algorithm is tested on complex scenes such as fast motion, various resolutions, and extensive scale variations. On the OTB100 dataset, the proposed tracker (named SiamAPA) improves AUC over the reference network by 2.4% on fast-motion scenes, 4.9% on various-resolution scenes, and 1.3% on extensive-scale-variation scenes. On the VOT2017 dataset, SiamAPA improves EAO by 3.7% over the reference network. On the LaSOT dataset, accuracy improves by 1% and robustness by 1.9% compared with the reference network. Thanks to the coordination of these three innovations, the proposed algorithm outperforms classical algorithms such as the SPM tracker on many datasets while running in real time.
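The abstract does not give the internals of the Depth Pixel-Wise Correlation module, so the following is only a minimal NumPy sketch of the general idea under stated assumptions: every spatial position of the template feature map is treated as a 1x1 kernel over the search feature map, and the resulting correlation channels are then reweighted by a gate built from global average pooling and global max pooling. The function names, the sigmoid gate, and the way the two pooled statistics are combined are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np


def pixelwise_correlation(template, search):
    """Correlate every template pixel with every search pixel.

    template: array of shape (C, Hz, Wz)
    search:   array of shape (C, Hx, Wx)
    returns:  array of shape (Hz * Wz, Hx, Wx)
    """
    c, hz, wz = template.shape
    # Each column of `kernels` is the C-dimensional feature of one template pixel.
    kernels = template.reshape(c, hz * wz)
    # Dot product over the channel axis for every (template pixel, search pixel) pair.
    return np.einsum("ck,chw->khw", kernels, search)


def pooled_channel_weights(corr):
    """Hypothetical gate: sum global average and max pooling, then apply a sigmoid."""
    avg = corr.mean(axis=(1, 2))   # global average pooling per channel
    mx = corr.max(axis=(1, 2))     # global max pooling per channel
    return 1.0 / (1.0 + np.exp(-(avg + mx)))


def reweighted_correlation(template, search):
    """Pixel-wise correlation followed by the assumed pooled channel gate."""
    corr = pixelwise_correlation(template, search)
    weights = pooled_channel_weights(corr)
    return corr * weights[:, None, None]
```

For a template of shape (C, 3, 3) and a search region of shape (C, 7, 7), `reweighted_correlation` returns 9 response maps of size 7x7, one per template pixel. Channels whose pooled response is weak are scaled down, which is one plausible way pooling could suppress background-dominated responses.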




Updated: 2022-09-03