View-independent representation with frame interpolation method for skeleton-based human action recognition

Jiang, Yingguo; Xu, Jun; Zhang, Tong

doi:10.1007/s13042-020-01132-4

View-independent representation with frame interpolation method for skeleton-based human action recognition

Original Article
Published: 05 May 2020

Volume 11, pages 2625–2636, (2020)
Cite this article

International Journal of Machine Learning and Cybernetics Aims and scope Submit manuscript

625 Accesses
142 Citations
Explore all metrics

Abstract

Human action recognition is an important branch of computer vision science. It is a challenging task based on skeletal data because of joints’ complex spatiotemporal information. In this work, we propose a method for action recognition, which consists of three parts: view-independent representation, frame interpolation, and combined model. First, the action sequence becomes view-independent representations independent of the view. Second, when judgment conditions are met, differentiated frame interpolations are used to expand the temporal dimensional information. Then, a combined model is adopted to extract these representation features and classify actions. Experimental results on two multi-view benchmark datasets Northwestern-UCLA and NTU RGB+D demonstrate the effectiveness of our complete method. Although using only one type of action feature and a simple architecture combined model, our complete method still outperforms most of the referential state-of-the-art methods and has strong robustness.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

View Invariant Human Action Recognition Using 3D Geometric Features

Viewpoint guided multi-stream neural network for skeleton action recognition

Article 17 June 2023

Yicheng He, Zixi Liang, … Ming Yin

Rotation-based spatial–temporal feature learning from skeleton sequences for action recognition

Article 02 March 2020

Xing Liu, Yanshan Li & Rongjie Xia

References

He W, Li Z, Chen CP (2017) A survey of human-centered intelligent robots: issues and challenges. IEEE/CAA J Autom Sin 4(4):602–609
Article Google Scholar
Zhang T, Wang X, Xu X, Chen CP (2019) Gcb-net: graph convolutional broad network and its application in emotion recognition. IEEE Trans Affect Comput
Song Y, Liu S, Tang J (2014) Describing trajectory of surface patch for human action recognition on rgb and depth videos. IEEE Signal Process Lett 22(4):426–429
Article Google Scholar
Zhang S, McCullagh P, Nugent C, Zheng H, Baumgarten M (2011) Optimal model selection for posture recognition in home-based healthcare. Int J Mach Learn Cybern 2(1):1–14
Article Google Scholar
Ijjina EP, Chalavadi KM (2017) Human action recognition in rgb-d videos using motion sequence information and deep learning. Pattern Recognit 72:504–516
Article Google Scholar
Naveed H, Khan G, Khan AU, Siddiqi A, Khan MUG (2019) Human activity recognition using mixture of heterogeneous features and sequential minimal optimization. Int J Mach Learn Cybern 10(9):2329–2340
Article Google Scholar
Arshad H, Khan MA, Sharif M, Yasmin M, Javed MY (2019) Multi-level features fusion and selection for human gait recognition: an optimized framework of Bayesian model and binomial distribution. Int J Mach Learn Cybern 10(12):3601–3618
Article Google Scholar
Zhao Q, Tsai CM, Chen RC, Huang CY (2019) Resident activity recognition based on binary infrared sensors and soft computing. Int J Mach Learn Cybern 10(2):291–299
Article Google Scholar
Wang J, Liu Z, Wu Y, Yuan J (2012) Mining actionlet ensemble for action recognition with depth cameras. In: 2012 IEEE conference on computer vision and pattern recognition. IEEE, pp 1290–1297
Hussein ME, Torki M, Gowayyed MA, El-Saban M (2013) Human action recognition using a temporal hierarchy of covariance descriptors on 3d joint locations. In: Twenty-third international joint conference on artificial intelligence
Chen C, Jafari R, Kehtarnavaz N (2014) Improving human action recognition using fusion of depth camera and inertial sensors. IEEE Trans Hum Mach Syst 45(1):51–61
Article Google Scholar
Wasenmüller O, Stricker D (2016) Comparison of kinect v1 and v2 depth images in terms of accuracy and precision. In: Asian conference on computer vision. Springer, pp 34–45
Hara K, Kataoka H, Satoh Y (2018) Can spatiotemporal 3d CNNS retrace the history of 2D CNNS and imagenet? In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6546–6555
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5998–6008
Zhang P, Lan C, Xing J, Zeng W, Xue J, Zheng N (2019) View adaptive neural networks for high performance skeleton-based human action recognition. IEEE Trans Pattern Anal Mach Intell 41(8):1963–1978
Article Google Scholar
Liu M, Liu H, Chen C (2017) Enhanced skeleton visualization for view invariant human action recognition. Pattern Recognit 68:346–362
Article Google Scholar
Papadakis A, Mathe E, Spyrou E, Mylonas P (2019) A geometric approach for cross-view human action recognition using deep learning. In: 2019 11th international symposium on image and signal processing and analysis (ISPA). IEEE, pp 258–263
Song S, Lan C, Xing J, Zeng W, Liu J (2017) An end-to-end spatio-temporal attention model for human action recognition from skeleton data. In: Thirty-first AAAI conference on artificial intelligence
Si C, Chen W, Wang W, Wang L, Tan T (2019) An attention enhanced graph convolutional lstm network for skeleton-based action recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1227–1236
Yang Z, Li Y, Yang J, Luo J (2018) Action recognition with visual attention on skeleton images. In: 2018 24th international conference on pattern recognition (ICPR). IEEE, pp 3309–3314
Shoemake K (1985) Animating rotation with quaternion curves. In: Proceedings of the 12th annual conference on computer graphics and interactive techniques, pp 245–254
Zhao Y, Gao L, He D, Guo H, Wang H, Zheng J, Yang X (2019) Multi-feature fusion action recognition based on key frames. In: 2019 seventh international conference on advanced cloud and big data (CBD). IEEE, pp 279–284
Xu Y, Hou Z, Liang J, Chen C, Jia L, Song Y (2019) Action recognition using weighted fusion of depth images and skeleton’s key frames. Multimed Tools Appl 78:1–16
Article Google Scholar
Xiao R, Hou Y, Guo Z, Li C, Wang P, Li W (2019) Self-attention guided deep features for action recognition. In: 2019 IEEE international conference on multimedia and expo (ICME). IEEE, pp 1060–1065
Dietterich TG, Kong EB (1995) Machine learning bias, statistical bias, and statistical variance of decision tree algorithms. Technical report, Department of Computer Science, Oregon State University
Zinbarg RE, Mineka S, Craske MG, Griffith JW, Sutton J, Rose RD, Nazarian M, Mor N, Waters AM (2010) The northwestern-ucla youth emotion project: associations of cognitive vulnerabilities, neuroticism and gender with past diagnoses of emotional disorders in adolescents. Behav Res Therapy 48(5):347–358
Article Google Scholar
Shahroudy A, Liu J, Ng TT, Wang G (2016) Ntu rgb+ d: a large scale dataset for 3d human activity analysis. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1010–1019
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:14126980
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems, pp 1097–1105
Akiba T, Suzuki S, Fukuda K (2017) Extremely large minibatch sgd: training resnet-50 on imagenet in 15 minutes. arXiv preprint arXiv:171104325
Chollet F (2017) Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1251–1258
Vemulapalli R, Arrate F, Chellappa R (2014) Human action recognition by representing 3d skeletons as points in a lie group. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 588–595
Xiao Y, Chen J, Wang Y, Cao Z, Zhou JT, Bai X (2019) Action recognition for depth video using multi-view dynamic images. Inf Sci 480:287–304
Article Google Scholar
Wang H, Wang L (2018) Learning content and style: joint action recognition and person identification from human skeletons. Pattern Recognit 81:23–35
Article Google Scholar
Lee I, Kim D, Kang S, Lee S (2017) Ensemble deep learning for skeleton-based action recognition using temporal sliding LSTM networks. In: Proceedings of the IEEE international conference on computer vision, pp 1012–1020
Anirudh R, Turaga P, Su J, Srivastava A (2015) Elastic functional coding of human actions: from vector-fields to latent variables. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3147–3155
Zhang P, Lan C, Zeng W, Xue J, Zheng N (2019) Semantics-guided neural networks for efficient skeleton-based human action recognition. arXiv preprint arXiv:190401189
Nie Q, Wang J, Wang X, Liu Y (2019) View-invariant human action recognition based on a 3d bio-constrained skeleton model. IEEE Trans Image Process 28:3959–3972
Article MathSciNet MATH Google Scholar
Zhang S, Yang Y, Xiao J, Liu X, Yang Y, Xie D, Zhuang Y (2018) Fusing geometric features for skeleton-based action recognition using multilayer lstm networks. IEEE Trans Multimed 20(9):2330–2343
Article Google Scholar
Wang P, Li W, Wan J, Ogunbona P, Liu X (2018) Cooperative training of deep aggregation networks for rgb-d action recognition. In: Thirty-second AAAI conference on artificial intelligence
Li B, Li X, Zhang Z, Wu F (2019) Spatio-temporal graph routing for skeleton-based action recognition. Proc AAAI Conf Artif Intell 33:8561–8568
Google Scholar
Xie C, Li C, Zhang B, Chen C, Han J, Zou C, Liu J (2018) Memory attention networks for skeleton-based action recognition. arXiv preprint arXiv:180408254
Zhu J, Zou W, Zhu Z, Hu Y (2019) Convolutional relation network for skeleton-based action recognition. Neurocomputing 370:109–117
Article Google Scholar
Shi L, Zhang Y, Cheng J, Lu H (2019) Skeleton-based action recognition with directed graph neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7912–7921
Zhang T, Su G, Qing C, Xu X, Cai B, Xing X (2019) Hierarchical lifelong learning by sharing representations and integrating hypothesis. IEEE Trans Syst Man Cybern Syst
He W, Gao H, Zhou C, Yang C, Li Z (2020) Reinforcement learning control of a flexible two-link manipulator: an experimental investigation. IEEE Trans Syst Man Cybern Syst
Zhang T, Chen CP, Chen L, Xu X, Hu B (2018) Design of highly nonlinear substitution boxes based on i-ching operators. IEEE Trans Cybern 48(12):3349–3358
Article Google Scholar

Download references

Acknowledgements

This work was supported in part by the National Key Research and Development Program of China under Number 2019YFA0706200 and 2019YFB1703600, the National Natural Science Foundation of China Grant under Number U1813203, U1801262, 61751202, 61751205.

Author information

Authors and Affiliations

School of Computer Science and Engineering, South China University of Technology, Guangzhou, 510006, China
Yingguo Jiang & Tong Zhang
Unit 95269 of the People’s Liberation Army, Guangzhou, 510075, China
Jun Xu

Authors

Yingguo Jiang
View author publications
You can also search for this author in PubMed Google Scholar
Jun Xu
View author publications
You can also search for this author in PubMed Google Scholar
Tong Zhang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Tong Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Jiang, Y., Xu, J. & Zhang, T. View-independent representation with frame interpolation method for skeleton-based human action recognition. Int. J. Mach. Learn. & Cyber. 11, 2625–2636 (2020). https://doi.org/10.1007/s13042-020-01132-4

Download citation

Received: 28 February 2020
Accepted: 10 April 2020
Published: 05 May 2020
Issue Date: December 2020
DOI: https://doi.org/10.1007/s13042-020-01132-4

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

View-independent representation with frame interpolation method for skeleton-based human action recognition

Abstract

Access this article

Similar content being viewed by others

View Invariant Human Action Recognition Using 3D Geometric Features

Viewpoint guided multi-stream neural network for skeleton action recognition

Rotation-based spatial–temporal feature learning from skeleton sequences for action recognition

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

View-independent representation with frame interpolation method for skeleton-based human action recognition

Abstract

Access this article

Similar content being viewed by others

View Invariant Human Action Recognition Using 3D Geometric Features

Viewpoint guided multi-stream neural network for skeleton action recognition

Rotation-based spatial–temporal feature learning from skeleton sequences for action recognition

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation