Revisiting Anchor Mechanisms for Temporal Action Localization.
IEEE Transactions on Image Processing ( IF 10.8 ) Pub Date : 2020-08-19 , DOI: 10.1109/tip.2020.3016486
Le Yang , Houwen Peng , Dingwen Zhang , Jianlong Fu , Junwei Han

Most current action localization methods follow an anchor-based pipeline: depicting action instances with pre-defined anchors, learning to select the anchors closest to the ground truth, and predicting the confidence of anchors with refinements. Pre-defined anchors set a prior on the location and duration of action instances, which facilitates localization of common action instances but limits flexibility when tackling instances with drastic variations, especially extremely short or extremely long ones. To address this problem, this paper proposes a novel anchor-free action localization module that assists action localization via temporal points. Specifically, this module represents an action instance as a point together with its distances to the starting and ending boundaries, alleviating the pre-defined anchor restrictions on action location and duration. The proposed anchor-free module is capable of predicting action instances whose duration is either extremely short or extremely long. By combining the proposed anchor-free module with a conventional anchor-based module, we propose a novel action localization framework, called A2Net. The cooperation between the anchor-free and anchor-based modules achieves performance superior to the state-of-the-art on THUMOS14 (45.5% vs. 42.8%). Furthermore, comprehensive experiments demonstrate the complementarity between the anchor-free and anchor-based modules, making A2Net simple but effective.
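The point-with-distances representation described above can be illustrated with a minimal sketch: each temporal point predicts its distances to the action's start and end boundaries, and a segment is recovered by subtracting and adding those distances. The function name and inputs below are illustrative assumptions, not the authors' implementation.

```python
def decode_anchor_free(points, d_start, d_end):
    """Decode per-point boundary-distance predictions into (start, end) segments.

    points  : list of temporal locations t (e.g., frame or snippet indices)
    d_start : predicted distance from each point to the action's start boundary
    d_end   : predicted distance from each point to the action's end boundary
    """
    segments = []
    for t, ds, de in zip(points, d_start, d_end):
        # Segment is recovered as [t - d_start, t + d_end]; unlike a
        # pre-defined anchor, the duration (ds + de) is unconstrained.
        segments.append((t - ds, t + de))
    return segments


# A very short and a very long instance decoded from the same scheme:
segs = decode_anchor_free([10.0, 50.0], [0.5, 40.0], [0.5, 35.0])
# -> [(9.5, 10.5), (10.0, 85.0)]
```

Because the duration is the sum of two freely predicted distances rather than a scaled anchor length, the same decoding handles both extremes that fixed anchors struggle with.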

Updated: 2020-08-28