Tracking more than 100 arbitrary objects at 25 FPS through deep learning
Pattern Recognition (IF 8), Pub Date: 2021-07-25, DOI: 10.1016/j.patcog.2021.108205
Lorenzo Vaquero, Víctor M. Brea, Manuel Mucientes

Most video analytics applications rely on object detectors to localize objects in frames. However, when real-time operation is required, running the detector on every frame is usually not feasible. This is partly circumvented by instantiating visual object trackers between detector calls, but that approach does not scale with the number of objects. To tackle this problem, we present SiamMT, a new deep learning solution for multiple visual object tracking that applies single-object tracking principles to multiple arbitrary objects in real time. To achieve this, SiamMT reuses feature computations, implements a novel crop-and-resize operator, and defines a new and efficient pairwise similarity operator. SiamMT naturally scales up to several dozen targets, reaching 25 fps with 122 simultaneous objects in VGA video, or with up to 100 simultaneous objects in HD720 video. SiamMT has been validated on five large real-time benchmarks, achieving leading performance against current state-of-the-art trackers.
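The three ingredients named in the abstract (shared feature computation, crop-and-resize, and pairwise similarity) can be illustrated with a minimal NumPy sketch. This is not SiamMT's implementation: the abstract does not specify the operators, so the nearest-neighbour crop, the plain cross-correlation, the function names `crop_features` and `pairwise_similarity`, and the random feature map standing in for a backbone output are all assumptions made for illustration. For simplicity the exemplars and search regions are cut from the same feature map, whereas a real tracker would take exemplars from an earlier frame.

```python
import numpy as np


def crop_features(fmap, box, out_size):
    """Nearest-neighbour crop-and-resize of a (C, H, W) feature map.

    box = (x0, y0, x1, y1) in feature-map coordinates (illustrative convention).
    """
    x0, y0, x1, y1 = box
    ys = np.round(np.linspace(y0, y1, out_size)).astype(int)
    xs = np.round(np.linspace(x0, x1, out_size)).astype(int)
    ys = np.clip(ys, 0, fmap.shape[1] - 1)
    xs = np.clip(xs, 0, fmap.shape[2] - 1)
    return fmap[:, ys][:, :, xs]  # (C, out_size, out_size)


def pairwise_similarity(exemplars, searches):
    """Cross-correlate exemplar i with search region i only (not all pairs).

    exemplars: (N, C, k, k), searches: (N, C, s, s)
    returns:   (N, s - k + 1, s - k + 1) response maps
    """
    k = exemplars.shape[-1]
    # windows: (N, C, s-k+1, s-k+1, k, k), a strided view without copying
    windows = np.lib.stride_tricks.sliding_window_view(
        searches, (k, k), axis=(2, 3)
    )
    return np.einsum("nchwij,ncij->nhw", windows, exemplars)


# One "backbone pass" per frame, shared by every target (random stand-in).
rng = np.random.default_rng(0)
frame_features = rng.standard_normal((8, 32, 32))

# Hypothetical targets: (exemplar box, search box) in feature-map coordinates.
targets = [
    ((10, 10, 13, 13), (6, 6, 17, 17)),
    ((20, 5, 23, 8), (16, 1, 27, 12)),
]

exemplars = np.stack([crop_features(frame_features, e, 4) for e, _ in targets])
searches = np.stack([crop_features(frame_features, s, 12) for _, s in targets])

responses = pairwise_similarity(exemplars, searches)  # shape (2, 9, 9)
peaks = [np.unravel_index(r.argmax(), r.shape) for r in responses]
```

The point of the sketch is the cost structure: the feature map is computed once per frame regardless of the number of targets, and each target adds only a crop and one small correlation. Each response map is expected to peak at the offset where its exemplar was cut from the search region.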




Updated: 2021-08-03