Abstract
Person search is a challenging computer vision task that jointly handles and optimizes pedestrian detection and person re-identification. It is also closer to real-world applications than person re-identification alone. Existing person search work has mainly focused on refining loss functions, using more complex network structures, or recasting person search as another task. However, few studies have approached the problem from a feature representation perspective. In this paper, we start from this point and present a novel method called FLAG to learn a better feature representation for person search. Specifically, partition pooling and cross-level feature hybridization are proposed to guide the model to learn more discriminative person features. Experiments show that the proposed method achieves an encouraging performance improvement and outperforms similar end-to-end person search methods.
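The abstract names partition pooling but does not define it here. As a rough illustration only, the sketch below assumes it resembles the stripe-wise pooling common in part-based re-identification models: a feature map is split into equal-height horizontal stripes and each stripe is average-pooled into one vector. The function name `partition_pool`, the equal-height split, and the use of average pooling are all assumptions made for this example, not details taken from the paper.

```python
from typing import List

# A feature map indexed as [channel][row][column].
FeatureMap = List[List[List[float]]]


def partition_pool(feature_map: FeatureMap, num_parts: int) -> List[List[float]]:
    """Average-pool a C x H x W feature map over horizontal stripes.

    Returns num_parts vectors of length C, ordered top to bottom.
    Assumption: equal-height stripes and average pooling; the scheme
    used in the paper may differ.
    """
    height = len(feature_map[0])
    if not 1 <= num_parts <= height:
        raise ValueError("num_parts must be between 1 and the map height")
    pooled = []
    for part in range(num_parts):
        # Integer stripe boundaries that cover every row exactly once.
        top = part * height // num_parts
        bottom = (part + 1) * height // num_parts
        vector = []
        for channel in feature_map:
            values = [v for row in channel[top:bottom] for v in row]
            vector.append(sum(values) / len(values))
        pooled.append(vector)
    return pooled


# Toy input: 2 channels, 4 rows, 2 columns.
toy = [
    [[1.0, 1.0], [1.0, 1.0], [3.0, 3.0], [3.0, 3.0]],
    [[0.0, 2.0], [0.0, 2.0], [4.0, 6.0], [4.0, 6.0]],
]
parts = partition_pool(toy, num_parts=2)
print(parts)  # one pooled vector per stripe
```

With two parts, the upper and lower halves of the toy map yield separate descriptors, so a model can compare body regions individually instead of relying on a single globally pooled vector.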
Acknowledgements
The authors would like to thank the anonymous reviewers for their critical and constructive comments and suggestions. This work was supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 61673299, 61203247, 61573259, and 61573255. This work was also supported by the Fundamental Research Funds for the Central Universities and the Open Project Program of the National Laboratory of Pattern Recognition (NLPR).
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Chen, Z., Lv, X., Sun, T. et al. FLAG: feature learning with additional guidance for person search. Vis Comput 37, 685–693 (2021). https://doi.org/10.1007/s00371-020-01880-y
DOI: https://doi.org/10.1007/s00371-020-01880-y