Few-shot Action Recognition with Prototype-centered Attentive Learning
arXiv - CS - Computer Vision and Pattern Recognition, Pub Date: 2021-01-20, DOI: arxiv-2101.08085
Xiatian Zhu, Antoine Toisoul, Juan-Manuel Pérez-Rúa, Li Zhang, Brais Martinez, Tao Xiang

Few-shot action recognition aims to recognize action classes from only a few training samples. Most existing methods adopt a meta-learning approach with episodic training. In each episode, the few samples in a meta-training task are split into a support set and a query set. The support set is used to build a classifier, which is then evaluated on the query set using a query-centered loss for model updating. This design has two major limitations: poor data efficiency, because the loss is query-centered only, and an inability to deal with outlying samples in the support set and overlapping inter-class distributions. In this paper, we overcome both limitations by proposing a new Prototype-centered Attentive Learning (PAL) model composed of two novel components. First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective, making full use of the limited training samples in each episode. Second, PAL integrates a hybrid attentive learning mechanism that can minimize the negative impact of outliers and promote class separation. Extensive experiments on four standard few-shot action benchmarks show that our method clearly outperforms previous state-of-the-art methods, with a particularly significant improvement (10+%) on the most challenging fine-grained action recognition benchmark.
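
The abstract does not give the loss formulas, so the following is only an illustrative sketch of how a query-centered loss and a prototype-centered contrastive loss might sit side by side in one episode. It assumes prototypes are class means of support embeddings, similarity is negative squared Euclidean distance, and the two losses are summed with equal weight; the function names, the toy embeddings, and the exact contrastive form are hypothetical, and the hybrid attentive mechanism is omitted entirely.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length embeddings."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def neg_sq_dist(a, b):
    """Similarity score: negative squared Euclidean distance (an assumption)."""
    return -sum((x - y) ** 2 for x, y in zip(a, b))

def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]

def episode_losses(support, query):
    """support: dict class -> list of embeddings; query: list of (embedding, class)."""
    classes = sorted(support)
    protos = {c: mean_vector(support[c]) for c in classes}

    # Query-centered loss: each query is the anchor, classified over prototypes.
    q_loss = 0.0
    for emb, label in query:
        logp = log_softmax([neg_sq_dist(emb, protos[c]) for c in classes])
        q_loss -= logp[classes.index(label)]
    q_loss /= len(query)

    # Prototype-centered loss (assumed form): each prototype is the anchor,
    # same-class queries are positives, all other queries are negatives.
    p_loss, n_pos = 0.0, 0
    for c in classes:
        logp = log_softmax([neg_sq_dist(protos[c], emb) for emb, _ in query])
        for i, (_, label) in enumerate(query):
            if label == c:
                p_loss -= logp[i]
                n_pos += 1
    p_loss /= max(n_pos, 1)
    return q_loss, p_loss

# Toy 2-way, 2-shot episode with hand-made 2-D embeddings (hypothetical data).
support = {"jump": [[0.0, 1.0], [0.2, 0.8]], "run": [[1.0, 0.0], [0.8, 0.2]]}
query = [([0.1, 0.9], "jump"), ([0.9, 0.1], "run")]
q_loss, p_loss = episode_losses(support, query)
total_loss = q_loss + p_loss  # equal weighting is an assumption
```

In this sketch the same support and query embeddings feed both terms: the query-centered term normalizes over prototypes for each query, while the prototype-centered term normalizes over queries for each prototype, which is one way to read the abstract's point about making fuller use of each episode.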

Updated: 2021-01-21