GCA-Net: Gait contour automatic segmentation model for video gait recognition

Luo, Jun; Wu, Haonan; Lei, Lei; Wang, Huiyan; Yang, Tao

doi:10.1007/s11042-021-11248-6

1168: Deep Pattern Discovery for Big Multimedia Data
Published: 27 July 2021

Volume 81, pages 34295–34307, (2022)
Multimedia Tools and Applications

Jun Luo¹,
Haonan Wu²,
Lei Lei²,
Huiyan Wang² &
Tao Yang ORCID: orcid.org/0000-0002-3822-198X²

Abstract

Gait recognition from video is an important task in surveillance video analysis. Although a number of studies have explored gait recognition models, they lack clarity on gait contour segmentation, which is an important but difficult step for automatic gait recognition. Most gait recognition algorithms use manually segmented gait contours, which are not available in real situations and are not suited to real-time video processing applications. To date, very little research has directly investigated automatic pedestrian gait contour segmentation. Current state-of-the-art instance segmentation methods fail to accurately describe the contour of the whole pedestrian body and often deviate from the true boundaries, especially for the contour between the two legs, which is essential information for gait recognition. This paper presents a novel gait contour automatic segmentation model (GCA-Net) for gait recognition in videos. To improve segmentation and edge-fitting accuracy, we first use dilated convolutions in the residual block to enhance the feature representation ability of the ResNet backbone, and then add an edge detection module that brings the predicted gait contour closer to the actual boundaries and therefore improves the edge-fitting result. The experimental results show the effectiveness of the proposed method: the edge detection module increases performance by 5.4%, and the residual block with dilated convolution increases it by a further 0.4%. More importantly, the proposed model can be directly integrated into existing gait recognition methods to automate video gait recognition.
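The abstract names two ingredients: dilated convolutions inside a residual block, and supervision on the contour edge. The sketch below illustrates both ideas in plain Python on small grids. It is not the authors' GCA-Net: the preview gives no layer sizes, loss, or module layout, so every function name here (`dilated_conv3x3`, `dilated_residual_block`, `mask_boundary`) is hypothetical, and the boundary extraction only shows what an edge target derived from a silhouette mask might look like.

```python
from typing import List

Grid = List[List[float]]


def dilated_conv3x3(image: Grid, kernel: Grid, dilation: int = 1) -> Grid:
    """Same-size 3x3 cross-correlation with zero padding.

    Kernel taps are spaced `dilation` pixels apart, so the receptive field
    spans (2 * dilation + 1) pixels per axis while still using 9 weights.
    """
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            total = 0.0
            for k_row in range(3):
                for k_col in range(3):
                    src_row = row + (k_row - 1) * dilation
                    src_col = col + (k_col - 1) * dilation
                    # Taps that fall outside the image contribute zero.
                    if 0 <= src_row < height and 0 <= src_col < width:
                        total += kernel[k_row][k_col] * image[src_row][src_col]
            output[row][col] = total
    return output


def dilated_residual_block(image: Grid, kernel_a: Grid, kernel_b: Grid,
                           dilation: int = 2) -> Grid:
    """Assumed block shape: y = x + conv_b(relu(conv_a(x))), both dilated."""
    hidden = dilated_conv3x3(image, kernel_a, dilation)
    hidden = [[max(0.0, value) for value in row] for row in hidden]
    residual = dilated_conv3x3(hidden, kernel_b, dilation)
    return [[x + r for x, r in zip(x_row, r_row)]
            for x_row, r_row in zip(image, residual)]


def mask_boundary(mask: List[List[int]]) -> List[List[int]]:
    """Mark foreground pixels that touch background (4-neighbourhood)
    or the image border. Illustrative edge target, not the paper's module."""
    height, width = len(mask), len(mask[0])
    edge = [[0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            if mask[row][col] != 1:
                continue
            for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                n_row, n_col = row + d_row, col + d_col
                outside = not (0 <= n_row < height and 0 <= n_col < width)
                if outside or mask[n_row][n_col] == 0:
                    edge[row][col] = 1
                    break
    return edge
```

Convolving a single-pixel impulse with an all-ones kernel makes the effect of dilation visible: with `dilation=1` the response fills a 3x3 patch around the impulse, while with `dilation=2` the nine responses spread over a 5x5 area with gaps between them. That wider context at the same parameter count is the usual motivation for dilated convolutions in segmentation backbones.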

Acknowledgements

This work is supported by the National Key Research and Development Program of China (No. 2018YFC0824406).

Author information

Authors and Affiliations

The Third Research Institute of the Ministry of Public Security, Shanghai, 200031, China
Jun Luo
School of Computer and Information Engineering, Zhejiang Gongshang University, Hangzhou, 310018, China
Haonan Wu, Lei Lei, Huiyan Wang & Tao Yang

Corresponding author

Correspondence to Tao Yang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Luo, J., Wu, H., Lei, L. et al. GCA-Net: Gait contour automatic segmentation model for video gait recognition. Multimed Tools Appl 81, 34295–34307 (2022). https://doi.org/10.1007/s11042-021-11248-6

Received: 03 September 2020
Revised: 27 May 2021
Accepted: 08 July 2021
Published: 27 July 2021
Issue Date: October 2022
DOI: https://doi.org/10.1007/s11042-021-11248-6
