PaStaNet: Toward Human Activity Knowledge Engine
arXiv - CS - Artificial Intelligence Pub Date : 2020-04-02 , DOI: arxiv-2004.00945
Yong-Lu Li, Liang Xu, Xinpeng Liu, Xijie Huang, Yue Xu, Shiyi Wang, Hao-Shu Fang, Ze Ma, Mingyang Chen, Cewu Lu

Existing image-based activity understanding methods mainly adopt direct mapping, i.e. from image to activity concepts, which may hit a performance bottleneck due to the huge semantic gap. In light of this, we propose a new path: first infer human part states, then reason out the activities from part-level semantics. Human Body Part States (PaSta) are fine-grained action semantic tokens that can compose activities and help us step toward a human activity knowledge engine. To fully exploit PaSta, we build a large-scale knowledge base, PaStaNet, which contains 7M+ PaSta annotations, and propose two corresponding models: first, Activity2Vec, which extracts PaSta features intended as general representations across activities; second, a PaSta-based reasoning method that infers activities from these features. Powered by PaStaNet, our method achieves significant improvements, e.g. 6.4 and 13.9 mAP on the full and one-shot sets of HICO in supervised learning, and 3.2 and 4.2 mAP on V-COCO and image-based AVA in transfer learning. Code and data are available at http://hake-mvig.cn/.
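The two-stage path above (part states first, then activity reasoning) can be illustrated with a minimal sketch. Note this is purely hypothetical pseudocode for the idea of composing activities from part-level tokens; the token vocabulary, rules, and function names are illustrative assumptions, not the paper's actual Activity2Vec or reasoning model.

```python
# Hypothetical sketch: compose an activity label from fine-grained
# (body part, state) tokens. Real PaSta reasoning is learned, not a
# hand-written lookup; these rules are invented for illustration only.

PASTA_RULES = {
    # frozenset of required (part, state) tokens -> activity label
    frozenset({("hand", "hold"), ("head", "look_at")}): "read",
    frozenset({("foot", "run"), ("arm", "swing")}): "run",
}

def infer_activity(part_states):
    """Stage 2: reason out an activity from part-level semantics."""
    observed = frozenset(part_states)
    for required, activity in PASTA_RULES.items():
        if required <= observed:  # all required part states present
            return activity
    return "unknown"

# Stage 1 (part-state inference from the image) is assumed done upstream.
print(infer_activity([("hand", "hold"), ("head", "look_at")]))  # read
print(infer_activity([("hand", "wave")]))                       # unknown
```

The point of the sketch is the decomposition: part-state tokens form an intermediate representation, so the image-to-activity gap is split into two smaller, more learnable mappings.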

Updated: 2020-04-22