Enhancing the association in multi-object tracking via neighbor graph
International Journal of Intelligent Systems (IF 7), Pub Date: 2021-07-26, DOI: 10.1002/int.22565
Tianyi Liang 1,2, Long Lan 3, Xiang Zhang 3, Xindong Peng 1, Zhigang Luo 4

Most modern multi-object tracking (MOT) systems for videos follow the tracking-by-detection paradigm, in which objects of interest are first located in each frame and then associated across frames to form complete trajectories. In this setting, the appearance features of objects usually provide the most important cues for data association, but they are highly susceptible to occlusions, illumination variations, and inaccurate detections, and so easily lead to incorrect trajectories. To address this issue, this study proposes to make full use of neighboring information. Our motivation derives from the observation that people tend to move in groups. As such, when an individual target's appearance changes markedly, an observer can still identify it from its neighbor context. To model the contextual information from neighbors, we first use the spatiotemporal relations among trajectories to efficiently select suitable neighbors for each target. Subsequently, we construct a neighbor graph for each target and its corresponding neighbors, then employ graph convolutional networks (GCNs) to model their relations and learn the graph features. To the best of our knowledge, this is the first work to explicitly leverage neighbor cues via GCNs in MOT. Finally, standardized evaluations on the MOT16 and MOT17 data sets demonstrate that our approach markedly reduces identity switches while achieving state-of-the-art overall performance.
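The neighbor-graph idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the selection rule, the graph layout, or the network details, so everything below is an assumption made for illustration. Neighbors are chosen here as the k tracks with the closest box centers in a single frame (a stand-in for the paper's spatiotemporal selection), the graph is assumed to be a star with the target at node 0, and the layer uses the common symmetric-normalized propagation rule ReLU(Â H W). All function names are hypothetical.

```python
import math

def select_neighbors(centers, target, k):
    """Return ids of the k tracks whose box centers are closest to the target.
    Assumption: plain Euclidean distance in one frame, not the paper's rule."""
    tx, ty = centers[target]
    others = [(math.hypot(x - tx, y - ty), tid)
              for tid, (x, y) in centers.items() if tid != target]
    return [tid for _, tid in sorted(others)[:k]]

def star_adjacency(n):
    """Star graph with self-loops: node 0 (the target) links to every neighbor."""
    return [[1.0 if i == j or i == 0 or j == 0 else 0.0 for j in range(n)]
            for i in range(n)]

def normalize(adj):
    """Symmetric normalization D^-1/2 A D^-1/2."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Dense matrix product on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(adj_norm, feats, weight):
    """One graph-convolution step: ReLU(adj_norm @ feats @ weight)."""
    out = matmul(matmul(adj_norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

# Toy data (made up): box centers and 2-d appearance features per track id.
centers = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (0.0, 2.0), 4: (10.0, 10.0)}
feats = {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [1.0, 1.0], 4: [5.0, 5.0]}

neighbors = select_neighbors(centers, target=1, k=2)
nodes = [1] + neighbors                      # target first, then its neighbors
H = [feats[t] for t in nodes]
W = [[1.0, 0.0], [0.0, 1.0]]                 # identity weights, for readability
A = normalize(star_adjacency(len(nodes)))
graph_feature = gcn_layer(A, H, W)[0]        # context-aware feature of the target
```

With identity weights, the target's output feature is a degree-weighted blend of its own appearance and that of its two nearest tracks, which is the sense in which neighbor context can back up a target whose own appearance has changed. In a trained model the weights would be learned and the resulting graph features compared across frames for association.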

Updated: 2021-09-24