
GCA-Net: Gait contour automatic segmentation model for video gait recognition

  • 1168: Deep Pattern Discovery for Big Multimedia Data
Multimedia Tools and Applications

Abstract

Gait recognition from video is an important task in surveillance video analysis. Although many studies have explored gait recognition models, they leave gait contour segmentation unclear, even though it is an important but difficult step in automatic gait recognition. Most gait recognition algorithms rely on manually segmented gait contours, which are not available in real situations and are unsuited to real-time video processing. To date, very little research has directly investigated automatic pedestrian gait contour segmentation. Current state-of-the-art instance segmentation methods fail to describe the contour of the whole pedestrian body accurately and often deviate from the true boundaries, especially along the contour between the two legs, which carries essential information for gait recognition. This paper presents a novel gait contour automatic segmentation model (GCA-Net) for gait recognition in videos. To improve segmentation and edge-fitting accuracy, we first use dilated convolutions in the residual block to strengthen the feature representation of the ResNet backbone, and then add an edge detection module that brings the predicted gait contour closer to the actual boundaries and thereby improves the edge-fitting result. Experimental results show the effectiveness of the proposed method: the edge detection module increases performance by 5.4%, and the residual block with dilated convolution increases it by a further 0.4%. More importantly, the proposed model can be integrated directly into existing gait recognition methods to automate video gait recognition.
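The two ideas named in the abstract, dilated convolution and edge supervision, can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. The function names `effective_kernel_size` and `mask_boundary` are hypothetical, and deriving an edge map from a binary silhouette with a 4-neighbour test is an assumption made for illustration only, since the abstract does not say how edge targets are obtained.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in pixels, covered by a dilated kernel along one axis.

    A dilation of d inserts (d - 1) gaps between adjacent kernel taps,
    so the kernel sees a wider context with the same number of weights.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def mask_boundary(mask):
    """Return the boundary pixels of a binary mask (list of lists of 0/1).

    Assumption for illustration: a foreground pixel counts as boundary
    if any of its 4-neighbours is background or lies outside the image.
    """
    height, width = len(mask), len(mask[0])
    edge = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if not mask[r][c]:
                continue  # background pixels are never boundary pixels
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                outside = not (0 <= nr < height and 0 <= nc < width)
                if outside or not mask[nr][nc]:
                    edge[r][c] = 1
                    break
    return edge


if __name__ == "__main__":
    # A 3x3 kernel with dilation 2 spans 5 pixels per axis.
    print(effective_kernel_size(3, 2))
    # Toy silhouette: a 3x3 foreground square inside a 5x5 image.
    toy = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)]
           for r in range(5)]
    for row in mask_boundary(toy):
        print(row)
```

In a setup like this, the boundary map would serve as the target for an edge branch, while the dilated kernels widen the context each backbone feature sees without adding weights.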



Acknowledgements

This work is supported by the National Key Research and Development Program of China (No. 2018YFC0824406).

Author information

Corresponding author

Correspondence to Tao Yang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Luo, J., Wu, H., Lei, L. et al. GCA-Net: Gait contour automatic segmentation model for video gait recognition. Multimed Tools Appl 81, 34295–34307 (2022). https://doi.org/10.1007/s11042-021-11248-6


  • DOI: https://doi.org/10.1007/s11042-021-11248-6
