Model-free tracker for multiple objects using joint appearance and motion inference.
IEEE Transactions on Image Processing ( IF 10.8 ) Pub Date : 2019-07-17 , DOI: 10.1109/tip.2019.2928123
Chongyu Liu , Rui Yao , S. Hamid Rezatofighi , Ian Reid , Qinfeng Shi

Model-free tracking is a widely accepted approach to tracking an arbitrary object in a video from a single frame annotation, with no further prior knowledge about the object of interest. Extending this problem to tracking multiple objects is highly challenging because: a) the tracker is not aware of the objects' type while trying to distinguish them from the background (detection task), and b) the tracker needs to distinguish each object from other, potentially similar, objects (data association task) to generate stable trajectories. To track multiple arbitrary objects, most existing model-free tracking approaches track each target individually and update its appearance model independently. In this scenario they therefore often perform poorly, due to confusion between the appearances of similar objects, sudden appearance changes, and occlusion. To tackle this problem, we propose to use both appearance and motion models, and to learn them jointly using graphical models and deep neural network features. We introduce an indicator variable that predicts sudden appearance change and/or occlusion. When either occurs, our model does not update the appearance model, thereby avoiding mistakenly updating the appearance of the object of interest with the background and/or an incorrect object, and instead relies on our motion model for tracking. Moreover, we consider the correlation among all targets, and seek the jointly optimal locations of all targets simultaneously as a graphical model inference problem. We learn the joint parameters of both the appearance model and the motion model in an online fashion under the LaRank framework. Experimental results show that our method achieves superior performance compared to competitive methods.
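The gating idea described above (an indicator variable that freezes appearance updates under suspected occlusion or sudden appearance change, falling back on the motion model) can be sketched roughly as follows. This is a minimal illustration, not the paper's actual implementation: the function names, the cosine-similarity occlusion test, the exponential-moving-average template update, and the constant-velocity motion model are all simplifying assumptions made here for clarity.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def track_step(appearance_model, motion_model, frame_feat, prev_box,
               occlusion_threshold=0.4):
    """One tracking step for a single target (hypothetical sketch).

    appearance_model: dict with a feature 'template' for the target.
    motion_model:     dict with a constant 'velocity' over box coordinates.
    frame_feat:       feature extracted at the candidate location.
    prev_box:         previous box as [x, y, w, h].
    Returns the new box and the indicator (True = appearance update allowed).
    """
    # Score the candidate with the current appearance model.
    score = cosine_similarity(appearance_model["template"], frame_feat)

    # Indicator variable: a low appearance score suggests sudden
    # appearance change or occlusion, so freeze appearance updates.
    update_ok = score >= occlusion_threshold

    if update_ok:
        # Exponential-moving-average update of the appearance template
        # (the paper learns the models jointly; this is a stand-in).
        lr = 0.1
        appearance_model["template"] = (
            (1 - lr) * appearance_model["template"] + lr * frame_feat
        )
        new_box = prev_box  # in a real tracker, the best-scoring candidate box
    else:
        # Rely on the motion model instead of the (unreliable) appearance cue.
        new_box = prev_box + motion_model["velocity"]
    return new_box, update_ok
```

A joint multi-target version would additionally couple the per-target decisions through graphical model inference, which is beyond this single-target sketch.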

Last updated: 2020-04-22