Modality-transfer generative adversarial network and dual-level unified latent representation for visible thermal Person re-identification

Fan, Xing; Jiang, Wei; Luo, Hao; Mao, Weijie

doi:10.1007/s00371-020-02015-z

Original article
Published: 24 November 2020

Volume 38, pages 279–294 (2022)

The Visual Computer

Xing Fan¹, Wei Jiang¹ (ORCID: orcid.org/0000-0002-9240-5851), Hao Luo¹ & Weijie Mao¹

Abstract

Visible thermal person re-identification, also known as RGB-infrared person re-identification, is an emerging cross-modality search problem: identifying the same person across different modalities. Solving it requires knowing what a person looks like in each modality. Ideally, images of the same person would be captured at the same time, from the same camera view, in both modalities, so that similarities and differences can be discovered. However, existing datasets do not completely satisfy these requirements. Thus, a modality-transfer generative adversarial network is proposed to generate a cross-modality counterpart for a source image in the target modality, yielding paired images of the same person. Because query images come from one modality and gallery images from the other, a unified representation for both modalities is needed so that cross-modality matching can be performed. In this study, a novel dual-level unified latent representation is proposed for the visible thermal person re-identification task. It comprises an image-level patch fusion strategy and a feature-level hierarchical granularity triplet loss, and it produces a more general and robust unified feature embedding. Extensive experiments on both the SYSU-MM01 dataset (with visible and near-infrared images) and the RegDB dataset (with visible and far-infrared images) demonstrate the efficiency and generality of the proposed method, which achieves state-of-the-art performance. The code will be publicly released.
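
The abstract names a feature-level triplet loss but does not give its form. As a rough illustration only, the sketch below shows the standard triplet margin loss with hardest-example mining, applied across modalities: each visible embedding serves as an anchor, and its positives and negatives are drawn from the thermal embeddings. This is not the hierarchical granularity variant the authors propose. The function names, the margin of 0.3, and the toy embeddings are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The margin value is an assumed default, not taken from the paper.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def cross_modality_batch_hard_loss(visible, thermal, margin=0.3):
    """Hypothetical cross-modality batch-hard triplet loss.

    `visible` and `thermal` are lists of (person_id, embedding) pairs.
    For each visible anchor, the farthest same-identity thermal embedding
    (hardest positive) and the closest different-identity thermal embedding
    (hardest negative) are selected. Anchors lacking either are skipped.
    """
    losses = []
    for pid, anchor in visible:
        pos = [euclidean(anchor, emb) for other, emb in thermal if other == pid]
        neg = [euclidean(anchor, emb) for other, emb in thermal if other != pid]
        if not pos or not neg:
            continue
        losses.append(max(0.0, max(pos) - min(neg) + margin))
    return sum(losses) / len(losses) if losses else 0.0


# Toy example: two identities, one embedding each per modality.
visible = [(0, [0.0, 0.0]), (1, [3.0, 4.0])]
thermal = [(0, [0.0, 1.0]), (1, [3.0, 3.0])]
loss = cross_modality_batch_hard_loss(visible, thermal)
```

In this toy batch each identity's thermal embedding already lies much closer to its visible anchor than the other identity's does, so every triplet satisfies the margin and the loss is zero. A symmetric term with thermal anchors and visible candidates could be added in the same way.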

Funding

This work is supported by the National Natural Science Foundation of China (No. 61633019), the Public Projects of Zhejiang Province (No. LGF18F030002), and the Science Foundation of Chinese Aerospace Industry (JCKY2018204B053).

Author information

Authors and Affiliations

The State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou, 310027, China
Xing Fan, Wei Jiang, Hao Luo & Weijie Mao

Corresponding author

Correspondence to Wei Jiang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Fan, X., Jiang, W., Luo, H. et al. Modality-transfer generative adversarial network and dual-level unified latent representation for visible thermal Person re-identification. Vis Comput 38, 279–294 (2022). https://doi.org/10.1007/s00371-020-02015-z

Accepted: 04 November 2020
Published: 24 November 2020
Issue Date: January 2022
DOI: https://doi.org/10.1007/s00371-020-02015-z
