Deep Learning Methods for 3D Human Pose Estimation under Different Supervision Paradigms: A Survey
Electronics ( IF 2.9 ) Pub Date : 2021-09-15 , DOI: 10.3390/electronics10182267
Dejun Zhang , Yiqi Wu , Mingyue Guo , Yilin Chen

The rise of deep learning technology has broadly promoted the practical application of artificial intelligence in production and daily life. In computer vision, many human-centered applications, such as video surveillance, human-computer interaction, and digital entertainment, rely heavily on accurate and efficient human pose estimation techniques. Inspired by the remarkable achievements in learning-based 2D human pose estimation, numerous studies are devoted to 3D human pose estimation via deep learning methods. Against this backdrop, this paper provides an extensive survey of recent literature on deep learning methods for 3D human pose estimation to show how this research has developed, track the latest research trends, and analyze the characteristics of the different types of methods. The literature is reviewed along the general pipeline of 3D human pose estimation, which consists of human body modeling, learning-based pose estimation, and regularization for refinement. Unlike existing reviews of the same topic, this paper focuses on deep learning-based methods. Learning-based pose estimation is discussed in two categories: single-person and multi-person. Each is further divided by data type into image-based methods and video-based methods. Moreover, because data is so significant for learning-based methods, this paper surveys 3D human pose estimation methods according to a taxonomy of supervision forms. Finally, this paper lists the current, widely used datasets and compares the performance of the reviewed methods. Based on this literature survey, it can be concluded that each branch of 3D human pose estimation starts with fully-supervised methods, and there is still much room for multi-person pose estimation based on other supervision methods from both image and video. Despite the significant development of 3D human pose estimation via deep learning, the inherent ambiguity and occlusion problems remain challenging issues that need to be better addressed.

Updated: 2021-09-15