Context-Interactive CNN for Person Re-Identification.
IEEE Transactions on Image Processing (IF 10.8) Pub Date: 2019-11-20, DOI: 10.1109/tip.2019.2953587
Wenfeng Song, Shuai Li, Tao Chang, Aimin Hao, Qinping Zhao, Hong Qin

Despite steady progress in recent years, cross-scenario person re-identification remains challenging, mainly because pedestrians are commonly surrounded by highly complex environmental contexts. In reality, the human perception mechanism can adaptively find proper contextualized spatial-temporal cues for pedestrian recognition. However, conventional methods fall short in adaptively leveraging long-term spatial-temporal information due to ever-increasing computational cost. Moreover, CNN-based deep learning methods are hard to optimize because the built-in context search operation is non-differentiable. To address these issues, this paper proposes a novel Context-Interactive CNN (CI-CNN) that dynamically finds both spatial and temporal contexts by embedding multi-task Reinforcement Learning (MTRL). CI-CNN streamlines the multi-task reinforcement learning with an actor-critic agent that captures the spatial-temporal context simultaneously and comprises a context-policy network and a context-critic network. The former learns policies to determine the optimal spatial context region and temporal sequence range. Based on the inferred spatial-temporal cues, the latter focuses on the identification task and provides feedback to the policy network. Thus, CI-CNN can simultaneously zoom the perception field in and out over the spatial and temporal domains to interact with the environmental context. By fostering the collaborative interaction between the person and the context, our method achieves outstanding performance on various public benchmarks, which confirms the rationality of our hypothesis and verifies the effectiveness of the CI-CNN framework.
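The actor-critic design described in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation; it is a hypothetical PyTorch outline (class names, feature dimension, and the parameterizations of the spatial crop and temporal range are all assumptions) showing how a context-policy network might propose a spatial context region plus a temporal sequence range, while a context-critic network scores the chosen context for the identification task.

# Minimal sketch (not the authors' code), assuming pooled backbone features of size 512.
import torch
import torch.nn as nn

class ContextPolicyNet(nn.Module):
    """Actor: predicts a spatial crop (cx, cy, scale) and a temporal range (start, length), all in (0, 1)."""
    def __init__(self, feat_dim=512):
        super().__init__()
        self.spatial_head = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 3), nn.Sigmoid())
        self.temporal_head = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 2), nn.Sigmoid())

    def forward(self, feat):
        # Returns normalized spatial-context and temporal-range parameters.
        return self.spatial_head(feat), self.temporal_head(feat)

class ContextCriticNet(nn.Module):
    """Critic: estimates how useful the selected context is for the identification task."""
    def __init__(self, feat_dim=512):
        super().__init__()
        self.value_head = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, feat):
        return self.value_head(feat)

# Usage sketch: feat stands in for a pooled feature of one pedestrian tracklet.
feat = torch.randn(4, 512)
policy, critic = ContextPolicyNet(), ContextCriticNet()
spatial_action, temporal_action = policy(feat)  # map to pixel crop / frame range downstream
value = critic(feat)                            # feedback signal for updating the policy

In this sketch the policy outputs are kept in (0, 1) and would be rescaled to actual image coordinates and frame indices; the critic's value would serve as the feedback that guides the policy network, mirroring the interaction loop the abstract describes.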

Updated: 2020-04-22