Quo Vadis, Skeleton Action Recognition?
arXiv - CS - Multimedia Pub Date : 2020-07-04 , DOI: arxiv-2007.02072
Pranay Gupta, Anirudh Thatipelli, Aditya Aggarwal, Shubh Maheshwari, Neel Trivedi, Sourav Das, Ravi Kiran Sarvadevabhatla

In this paper, we study current and upcoming frontiers across the landscape of skeleton-based human action recognition. To begin with, we benchmark state-of-the-art models on the NTU-120 dataset and provide a multi-layered assessment of the results. To examine skeleton action recognition 'in the wild', we introduce Skeletics-152, a curated and 3-D pose-annotated subset of RGB videos sourced from Kinetics-700, a large-scale action dataset. The results from benchmarking the top performers of NTU-120 on Skeletics-152 reveal the challenges and domain gap induced by actions 'in the wild'. We extend our study to include out-of-context actions by introducing Skeleton-Mimetics, a dataset derived from the recently introduced Mimetics dataset. Finally, as a new frontier for action recognition, we introduce Metaphorics, a dataset with caption-style annotated YouTube videos of the popular social game Dumb Charades and interpretative dance performances. Overall, our work characterizes the strengths and limitations of existing approaches and datasets. It also provides an assessment of top-performing approaches across a spectrum of activity settings and, via the introduced datasets, proposes new frontiers for human action recognition.

Updated: 2020-07-07