Abstract
Gait recognition from video is an important task in surveillance video analysis. Although a number of studies have explored gait recognition models, they are unclear about gait contour segmentation, which is an important but difficult step in automatic gait recognition. Most gait recognition algorithms rely on manually segmented gait contours, which are not available in real situations and are not suited to real-time video processing applications. To date, very little research has directly investigated automatic pedestrian gait contour segmentation. Current state-of-the-art instance segmentation methods fail to accurately describe the contour of the whole pedestrian body and often deviate from the true boundaries, especially for the contour between the two legs, which carries essential information for gait recognition. This paper presents a novel gait contour automatic segmentation model (GCA-Net) for gait recognition in videos. To improve segmentation and edge-fitting accuracy, we first use dilated convolutions in the residual block to enhance the feature representation ability of the ResNet backbone, and then add an edge detection module that brings the predicted gait contour closer to the actual boundaries and thereby improves the edge-fitting result. Experimental results show the effectiveness of the proposed method: the edge detection module increases performance by 5.4%, and the residual block with dilated convolution further increases it by 0.4%. More importantly, the proposed model can be directly integrated into existing gait recognition methods to automate video gait recognition.
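As a rough illustration of why dilated convolutions can strengthen a backbone's features, the following pure-Python sketch shows how dilation widens the input span seen by a kernel without adding weights. It is not the GCA-Net implementation: the abstract does not state dilation rates or layer configurations, so the function names and the rates used here are hypothetical, and a 1-D signal stands in for an image row.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in input pixels) covered by one dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


def receptive_field(layers):
    """Receptive field of stacked stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs; each layer widens
    the field by (effective_kernel_size - 1) pixels.
    """
    field = 1
    for kernel_size, dilation in layers:
        field += effective_kernel_size(kernel_size, dilation) - 1
    return field


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D cross-correlation with a step of `dilation` between taps."""
    span = effective_kernel_size(len(kernel), dilation)
    return [
        sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


# Two stacked 3-tap layers; the dilation rates are assumptions for illustration.
plain = receptive_field([(3, 1), (3, 1)])    # ordinary convolutions
dilated = receptive_field([(3, 2), (3, 2)])  # same weight count, dilation 2

# A 3-tap difference kernel applied with dilation 2 compares samples 4 apart.
response = dilated_conv1d([1, 2, 3, 4, 5, 6], [1, 0, -1], dilation=2)
```

With the same number of weights, the dilated stack covers a wider neighbourhood than the plain one, which is the property that a dilated residual block exploits.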
Acknowledgements
This work is supported by the National Key Research and Development Program of China (No. 2018YFC0824406).
Cite this article
Luo, J., Wu, H., Lei, L. et al. GCA-Net: Gait contour automatic segmentation model for video gait recognition. Multimed Tools Appl 81, 34295–34307 (2022). https://doi.org/10.1007/s11042-021-11248-6
DOI: https://doi.org/10.1007/s11042-021-11248-6