
Modality-transfer generative adversarial network and dual-level unified latent representation for visible thermal person re-identification

  • Original article
  • Journal: The Visual Computer

Abstract

Visible thermal person re-identification, also known as RGB-infrared person re-identification, is an emerging cross-modality retrieval problem that aims to identify the same person across different modalities. Solving it requires knowing what a person looks like in each modality: images of the same person, captured at the same time and from the same camera view in both modalities, would reveal the cross-modality similarities and differences. However, existing datasets do not fully satisfy these requirements. A modality-transfer generative adversarial network is therefore proposed to generate, for each source image, a cross-modality counterpart in the target modality, yielding paired images of the same person. Because query images come from one modality and gallery images from the other, a unified representation of both modalities is needed so that cross-modality matching can be performed. In this study, a novel dual-level unified latent representation is proposed for the visible thermal person re-identification task, comprising an image-level patch fusion strategy and a feature-level hierarchical granularity triplet loss, which together produce a more general and robust unified feature embedding. Extensive experiments on both the SYSU-MM01 dataset (visible and near-infrared images) and the RegDB dataset (visible and far-infrared images) demonstrate the effectiveness and generality of the proposed method, which achieves state-of-the-art performance. The code will be publicly released.
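
Since only the abstract is reproduced here, the snippet below is not the authors' implementation. It is a minimal PyTorch-style sketch of a margin-based triplet loss summed over several feature granularities (for example, a global embedding plus part-level embeddings), which is the general idea a hierarchical granularity triplet loss builds on. The function names, margin value, and the global/part grouping are illustrative assumptions.

```python
# Illustrative sketch (PyTorch): a margin-based triplet loss applied at several
# feature granularities. The "global plus parts" grouping and equal weighting
# below are assumptions for illustration, not the paper's exact formulation.
import torch
import torch.nn.functional as F


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard margin-based triplet loss on L2-normalized embeddings."""
    anchor = F.normalize(anchor, dim=1)
    positive = F.normalize(positive, dim=1)
    negative = F.normalize(negative, dim=1)
    d_ap = (anchor - positive).pow(2).sum(dim=1)  # squared distance to positive
    d_an = (anchor - negative).pow(2).sum(dim=1)  # squared distance to negative
    return F.relu(d_ap - d_an + margin).mean()


def multi_granularity_triplet_loss(feats_a, feats_p, feats_n, margin=0.3):
    """Sum the triplet loss over a list of granularity levels,
    e.g. one global embedding plus several horizontal-part embeddings."""
    return sum(triplet_loss(a, p, n, margin)
               for a, p, n in zip(feats_a, feats_p, feats_n))


if __name__ == "__main__":
    # Toy example: one global (256-d) and two part-level (128-d) embeddings
    # for a batch of 8 anchor/positive/negative triplets.
    dims = [256, 128, 128]
    feats_a = [torch.randn(8, d) for d in dims]
    feats_p = [torch.randn(8, d) for d in dims]
    feats_n = [torch.randn(8, d) for d in dims]
    print(multi_granularity_triplet_loss(feats_a, feats_p, feats_n))
```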

Funding

This work is supported by the National Natural Science Foundation of China (No. 61633019), the Public Projects of Zhejiang Province (No. LGF18F030002), and the Science Foundation of Chinese Aerospace Industry (JCKY2018204B053).

Author information

Corresponding author

Correspondence to Wei Jiang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Fan, X., Jiang, W., Luo, H. et al.: Modality-transfer generative adversarial network and dual-level unified latent representation for visible thermal person re-identification. Vis. Comput. 38, 279–294 (2022). https://doi.org/10.1007/s00371-020-02015-z
