PaStaNet: Toward Human Activity Knowledge Engine
arXiv - CS - Artificial Intelligence Pub Date : 2020-04-02 , DOI: arxiv-2004.00945
Yong-Lu Li, Liang Xu, Xinpeng Liu, Xijie Huang, Yue Xu, Shiyi Wang, Hao-Shu Fang, Ze Ma, Mingyang Chen, Cewu Lu

Existing image-based activity understanding methods mainly adopt direct mapping, i.e. from image to activity concepts, which may hit a performance bottleneck due to the huge semantic gap. In light of this, we propose a new path: first infer human part states, then reason out the activities from part-level semantics. Human Body Part States (PaSta) are fine-grained action semantic tokens that can compose activities and help us step toward a human activity knowledge engine. To fully exploit PaSta, we build a large-scale knowledge base, PaStaNet, which contains 7M+ PaSta annotations, and propose two corresponding models: first, Activity2Vec, which extracts PaSta features intended as general representations across activities; second, a PaSta-based reasoning method that infers activities from these features. Powered by PaStaNet, our method achieves significant improvements, e.g. 6.4 and 13.9 mAP on the full and one-shot sets of HICO in supervised learning, and 3.2 and 4.2 mAP on V-COCO and image-based AVA in transfer learning. Code and data are available at http://hake-mvig.cn/.
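The two-stage path above (part states first, then activity reasoning) can be illustrated with a minimal sketch. Note this is purely hypothetical pseudocode for the idea of composing activities from part-level tokens; the token vocabulary, rules, and function names are illustrative assumptions, not the paper's actual Activity2Vec or reasoning model.

```python
# Hypothetical sketch: compose an activity label from fine-grained
# (body part, state) tokens. Real PaSta reasoning is learned, not a
# hand-written lookup; these rules are invented for illustration only.

PASTA_RULES = {
    # frozenset of required (part, state) tokens -> activity label
    frozenset({("hand", "hold"), ("head", "look_at")}): "read",
    frozenset({("foot", "run"), ("arm", "swing")}): "run",
}

def infer_activity(part_states):
    """Stage 2: reason out an activity from part-level semantics."""
    observed = frozenset(part_states)
    for required, activity in PASTA_RULES.items():
        if required <= observed:  # all required part states present
            return activity
    return "unknown"

# Stage 1 (part-state inference from the image) is assumed done upstream.
print(infer_activity([("hand", "hold"), ("head", "look_at")]))  # read
print(infer_activity([("hand", "wave")]))                       # unknown
```

The point of the sketch is the decomposition: part-state tokens form an intermediate representation, so the image-to-activity gap is split into two smaller, more learnable mappings.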

Updated: 2020-04-22