Dual Siamese network for RGBT tracking via fusing predicted position maps
The Visual Computer (IF 3.5), Pub Date: 2021-05-02, DOI: 10.1007/s00371-021-02131-4
Chang Guo, Dedong Yang, Chang Li, Peng Song

Visual object tracking is a fundamental task in computer vision. Despite its rapid development, tracking that relies only on visible light images is unreliable in some cases. Visible light and thermal infrared images have complementary imaging advantages, so using them as a joint input for tracking, known as RGBT tracking, has attracted growing attention. Existing RGBT tracking methods can be divided into image-level, feature-level, and response-level fusion tracking. Compared with the first two, response-level fusion tracking can exploit deeper dual-modality image information, but most such methods rely on traditional tracking techniques and introduce weights at inappropriate stages. Based on these observations, we propose a response-level fusion tracking algorithm that employs deep learning. The weight distribution is placed in the feature extraction stage, for which we design a joint modal channel attention module. We adopt the Siamese framework and expand it into a dual Siamese subnetwork. We also improve the region proposal subnetwork and propose a strategy for fusing the predicted position maps of the two modalities. To verify the performance of our algorithm, we conducted experiments on two tracking benchmarks. The results show that our algorithm performs very well and runs at 116 frames per second, far exceeding the real-time requirement of 25 frames per second.
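
As an illustration only, the general idea of fusing the predicted position maps of two modalities can be sketched as below. The abstract does not state the fusion rule, so the peak-based weighting, the function names `peak` and `fuse_position_maps`, and the toy 3x3 maps are all assumptions made for this sketch and should not be read as the authors' method. Scores are assumed to be non-negative.

```python
from typing import List, Tuple

Map = List[List[float]]  # a 2D grid of position scores for one modality


def peak(score_map: Map) -> Tuple[float, Tuple[int, int]]:
    """Return the highest score in a 2D map and its (row, col) position."""
    best, pos = float("-inf"), (0, 0)
    for r, row in enumerate(score_map):
        for c, value in enumerate(row):
            if value > best:
                best, pos = value, (r, c)
    return best, pos


def fuse_position_maps(rgb_map: Map, tir_map: Map) -> Map:
    """Blend two same-sized maps, weighting each by its peak score.

    The weighting rule is a hypothetical choice for illustration:
    the modality with the more confident peak contributes more.
    """
    rgb_peak, _ = peak(rgb_map)
    tir_peak, _ = peak(tir_map)
    total = rgb_peak + tir_peak
    # Fall back to equal weights when neither map has a positive peak.
    w_rgb = rgb_peak / total if total > 0 else 0.5
    w_tir = 1.0 - w_rgb
    return [
        [w_rgb * a + w_tir * b for a, b in zip(rgb_row, tir_row)]
        for rgb_row, tir_row in zip(rgb_map, tir_map)
    ]


# Toy example: the visible-light map is weak (e.g. a dark scene),
# while the thermal infrared map has a clear peak at row 1, column 2.
rgb_map = [[0.1, 0.2, 0.1], [0.2, 0.3, 0.1], [0.1, 0.1, 0.1]]
tir_map = [[0.0, 0.1, 0.0], [0.1, 0.2, 0.9], [0.0, 0.1, 0.1]]

fused = fuse_position_maps(rgb_map, tir_map)
score, position = peak(fused)
print(position)  # (1, 2): the thermal peak dominates the fused map
```

In this toy case the thermal map receives a weight of about 0.75, so the fused peak follows the thermal prediction rather than the weaker visible-light one.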




Updated: 2021-05-02