Camera domain adaptation based on cross-patch transformers for person re-identification
Pattern Recognition Letters ( IF 3.9 ) Pub Date : 2022-05-08 , DOI: 10.1016/j.patrec.2022.05.005
Zhidan Ran 1, 2 , Xiaobo Lu 1, 2
As an essential task in video surveillance, person re-identification (Re-ID) suffers from appearance variations across different cameras. In this paper, we propose an effective transformer-based Re-ID framework for learning identity-discriminative and camera-invariant feature representations. In contrast to the recent direction of using generative models to augment training data and enhance invariance to input variations, we show that explicitly designing a novel adversarial loss from the perspective of feature representation learning effectively penalizes the distribution discrepancy across multiple camera domains. Recently, the pure transformer model has gained much attention due to its strong representation capability. We employ a pure transformer encoder to extract a global feature vector from the patch tokens of each person image. Notably, a novel cross-patch encoder is introduced to capture structural information between image patches. Extensive experiments on three challenging datasets demonstrate the effectiveness and superiority of the proposed learning framework.
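The paper does not include code, but the core idea of a cross-patch encoder, letting every patch token attend to every other patch so that structural relations between image regions are mixed into the representation, can be illustrated with a minimal single-head self-attention sketch. All names and shapes below are hypothetical and chosen only for illustration; the authors' actual architecture (head count, dimensions, projections) will differ.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_patch_attention(tokens, seed=0):
    """Single-head self-attention over patch tokens of shape (N, d).

    Each of the N patches attends to all N patches, so the output for
    one patch aggregates information from the whole image -- the kind of
    patch-to-patch structural mixing a cross-patch encoder performs.
    """
    n, d = tokens.shape
    rng = np.random.default_rng(seed)
    # Random projection weights stand in for learned Q/K/V projections.
    Wq = rng.standard_normal((d, d)) / np.sqrt(d)
    Wk = rng.standard_normal((d, d)) / np.sqrt(d)
    Wv = rng.standard_normal((d, d)) / np.sqrt(d)
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    # (N, N) attention map: row i holds patch i's weights over all patches.
    attn = softmax(Q @ K.T / np.sqrt(d))
    return attn @ V, attn

# 16 patch tokens of dimension 32, e.g. from a split person image.
tokens = np.random.default_rng(1).standard_normal((16, 32))
out, attn = cross_patch_attention(tokens)
```

Here `out` has the same shape as the input tokens, and each row of `attn` is a probability distribution over patches; a global feature vector for Re-ID would then be pooled (or read from a class token) downstream.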



Updated: 2022-05-08