
FLAG: feature learning with additional guidance for person search

  • Original article
The Visual Computer

Abstract

Person search is a challenging computer vision task that jointly handles and optimizes pedestrian detection and person re-identification. It is also closer to real-world applications than person re-identification alone. Existing person search methods mainly focus on refining loss functions, using more complex network structures, or recasting person search as another task; few attempt to solve the problem from a feature representation perspective. In this paper, we take this perspective and present a novel method called FLAG to learn a better feature representation for person search. Specifically, partition pooling and cross-level feature hybridization are proposed to guide the model to learn more discriminative person features. Experiments show that the proposed method achieves encouraging performance improvements and outperforms similar end-to-end person search methods.
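The abstract names the two components but does not define them, so the following is only an illustrative sketch under stated assumptions, not the FLAG implementation. It assumes that partition pooling resembles the stripe-wise average pooling used in part-based re-identification models, and that cross-level feature hybridization can be approximated by normalizing and concatenating feature vectors taken from two network levels. The function names `partition_pool` and `hybridize`, and the nested-list feature map layout, are hypothetical.

```python
from typing import List

# Assumed layout: channels x height x width, as nested Python lists.
FeatureMap = List[List[List[float]]]


def partition_pool(fmap: FeatureMap, num_parts: int) -> List[List[float]]:
    """Average-pool each of `num_parts` horizontal stripes of a feature map.

    Returns one vector (length = number of channels) per stripe.
    Assumption: stripes are equal-height horizontal bands.
    """
    height = len(fmap[0])
    if num_parts <= 0 or height % num_parts != 0:
        raise ValueError("height must be divisible by a positive num_parts")
    stripe = height // num_parts
    parts = []
    for p in range(num_parts):
        rows = range(p * stripe, (p + 1) * stripe)
        vec = []
        for channel in fmap:
            values = [v for r in rows for v in channel[r]]
            vec.append(sum(values) / len(values))
        parts.append(vec)
    return parts


def hybridize(low_level: List[float], high_level: List[float]) -> List[float]:
    """Fuse two feature vectors by L2-normalizing each and concatenating.

    Assumption: concatenation stands in for whatever fusion the paper uses.
    """
    def l2_normalize(v: List[float]) -> List[float]:
        norm = sum(x * x for x in v) ** 0.5
        return [x / norm for x in v] if norm > 0 else list(v)

    return l2_normalize(low_level) + l2_normalize(high_level)
```

For example, a single-channel 4x2 map pooled into two stripes yields one scalar per stripe, and `hybridize` returns a vector whose length is the sum of its two inputs. In a real detector the same idea would typically be applied to region-of-interest features rather than whole-image maps.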



Acknowledgements

The authors would like to thank the anonymous reviewers for their critical and constructive comments and suggestions. This work was supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 61673299, 61203247, 61573259, and 61573255. This work was also supported by the Fundamental Research Funds for the Central Universities and the Open Project Program of the National Laboratory of Pattern Recognition (NLPR).

Author information

Corresponding authors

Correspondence to Cairong Zhao or Wei Chen.

About this article

Cite this article

Chen, Z., Lv, X., Sun, T. et al. FLAG: feature learning with additional guidance for person search. Vis Comput 37, 685–693 (2021). https://doi.org/10.1007/s00371-020-01880-y

  • DOI: https://doi.org/10.1007/s00371-020-01880-y
