Effective action recognition with embedded key point shifts
Pattern Recognition (IF 8) Pub Date: 2021-07-18, DOI: 10.1016/j.patcog.2021.108172
Haozhi Cao 1, Yuecong Xu 1, Jianfei Yang 1, Kezhi Mao 1, Jianxiong Yin 2, Simon See 2

Temporal feature extraction is an essential technique in video-based action recognition. Key points have been utilized in skeleton-based action recognition methods, but they require costly key point annotation. In this paper, we propose a novel temporal feature extraction module, named Key Point Shifts Embedding Module (KPSEM), to adaptively extract channel-wise key point shifts across video frames without key point annotation. Key points are adaptively extracted as the feature points with maximum feature values in split regions, and key point shifts are the spatial displacements of corresponding key points between frames. The key point shifts are encoded as the overall temporal features via linear embedding layers in a multi-set manner. By embedding key point shifts at trivial computational cost, our method achieves state-of-the-art performance of 78.81% on Mini-Kinetics and competitive performance on the UCF101, Something-Something-v1 and HMDB51 datasets.
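
The abstract outlines the KPSEM pipeline at a high level: per-channel key points are taken as the maximum-activation locations within split regions of each frame's feature map, key point shifts are the displacements of corresponding key points between consecutive frames, and the shifts are encoded by linear embedding layers. Below is a minimal PyTorch-style sketch of that idea; it is not the authors' implementation, and the function name, the 2x2 region split, and the embedding dimension are illustrative assumptions.

```python
import torch

def key_point_shift_embedding(features, regions=2, embed_dim=64):
    """Sketch of key point shift extraction for one clip.

    features: (T, C, H, W) per-frame feature maps. For each channel and each
    of the regions x regions split regions, the location of the maximum
    activation is taken as the key point; shifts are the displacements of
    corresponding key points between consecutive frames, flattened and passed
    through a linear embedding layer (all sizes here are assumptions).
    """
    T, C, H, W = features.shape
    rh, rw = H // regions, W // regions

    coords = []
    for i in range(regions):
        for j in range(regions):
            patch = features[:, :, i * rh:(i + 1) * rh, j * rw:(j + 1) * rw]
            idx = patch.reshape(T, C, -1).argmax(dim=-1)     # (T, C) flat index of max
            y = idx // rw + i * rh                           # absolute row of key point
            x = idx % rw + j * rw                            # absolute column of key point
            coords.append(torch.stack([y, x], dim=-1))       # (T, C, 2)
    coords = torch.stack(coords, dim=2).float()              # (T, C, regions^2, 2)

    shifts = (coords[1:] - coords[:-1]).reshape(T - 1, -1)   # displacements between frames
    embed = torch.nn.Linear(shifts.shape[-1], embed_dim)     # linear embedding layer
    return embed(shifts)                                     # (T-1, embed_dim) temporal feature

# Example: 8 frames, 16 channels, 14x14 feature maps
feats = torch.randn(8, 16, 14, 14)
print(key_point_shift_embedding(feats).shape)  # torch.Size([7, 64])
```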




Updated: 2021-07-24