A Context Knowledge Map Guided Coarse-to-fine Action Recognition.
IEEE Transactions on Image Processing ( IF 10.8 ) Pub Date : 2019-11-12 , DOI: 10.1109/tip.2019.2952088
Yanli Ji , Yue Zhan , Yang Yang , Xing Xu , Fumin Shen , Heng Tao Shen

Human actions span a wide variety of categories, which poses a major challenge for action recognition. However, based on similarities in human body poses, scenes, and interactive objects, human actions can be grouped into semantic groups, e.g., sports, cooking, etc. Therefore, in this paper, we propose a novel approach that recognizes human actions from coarse to fine. Taking full advantage of high-level semantic contexts, a context knowledge map guided recognition method is designed to realize the coarse-to-fine procedure. In this approach, we define semantic contexts from the interactive objects, scenes, and body motions in action videos, and build a context knowledge map to automatically define coarse-grained groups. Fine-grained classifiers are then applied within each group to achieve accurate action recognition. The coarse-to-fine procedure narrows the set of action categories each target classifier must distinguish, which benefits recognition performance. We evaluate the proposed approach on the CCV, HMDB-51, and UCF-101 datasets. Experiments verify its effectiveness, improving recognition precision by more than 5% on average over current approaches. Compared with the state-of-the-art, it also achieves outstanding performance, with accuracies of 93.1%, 95.4%, and 74.5% on CCV, UCF-101, and HMDB-51, respectively.
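The coarse-to-fine idea described above can be sketched as a two-stage pipeline: a coarse classifier first maps a video's context features to a semantic group, and a group-specific fine classifier then predicts the action within that group. The sketch below is a hypothetical illustration of this structure (the class name, the rule-based stand-in classifiers, and the toy feature dictionary are all assumptions, not the authors' implementation):

```python
# Hypothetical sketch of a coarse-to-fine action classifier.
# In the paper, the coarse stage is guided by a context knowledge map
# built from objects, scenes, and body motions; here simple rule-based
# functions stand in for the learned classifiers.

class CoarseToFineClassifier:
    def __init__(self, coarse_classifier, fine_classifiers):
        self.coarse = coarse_classifier          # features -> semantic group
        self.fine = fine_classifiers             # group -> (features -> action)

    def predict(self, features):
        group = self.coarse(features)            # stage 1: coarse group
        action = self.fine[group](features)      # stage 2: fine action within group
        return group, action


# Toy stand-ins: each fine classifier only distinguishes the few
# actions inside its own semantic group, narrowing the label space.
def coarse_rule(f):
    return "sports" if f["scene"] == "field" else "cooking"

fine_rules = {
    "sports": lambda f: "soccer" if f["object"] == "ball" else "running",
    "cooking": lambda f: "chopping" if f["object"] == "knife" else "stirring",
}

clf = CoarseToFineClassifier(coarse_rule, fine_rules)
print(clf.predict({"scene": "field", "object": "ball"}))  # ('sports', 'soccer')
```

In practice both stages would be learned models over video features; the design point is that each fine classifier faces far fewer candidate categories than a single flat classifier would.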

Updated: 2020-04-22