Dual attention convolutional network for action recognition
IET Image Processing (IF 2.3), Pub Date: 2020-04-30, DOI: 10.1049/iet-ipr.2019.0963
Xiaoqiang Li, Miao Xie, Yin Zhang, Guangtai Ding, Weiqin Tong

Action recognition has been an active research area for many years. Extracting discriminative spatial and temporal features of different actions plays a key role in accomplishing this task. Current popular methods for action recognition are mainly based on two-stream Convolutional Networks (ConvNets) or 3D ConvNets. However, the computational cost of two-stream ConvNets is high because they require optical flow, while 3D ConvNets consume too much memory because they have a large number of parameters. To alleviate these problems, the authors propose a Dual Attention ConvNet (DANet) built on a dual attention mechanism that consists of spatial attention and temporal attention. The former concentrates on the main moving objects in a video frame using a ConvNet structure, while the latter captures relationships among multiple video frames by adopting self-attention. Their network is based entirely on 2D ConvNets and takes only RGB frames as input. Experimental results on the UCF-101 and HMDB-51 benchmarks demonstrate that DANet achieves results comparable to those of leading methods, which confirms the effectiveness of the dual attention mechanism.
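To make the two branches concrete, below is a minimal PyTorch sketch of a dual attention head of the kind the abstract describes: a conv-based spatial gate applied per frame, followed by self-attention across frames. This is an illustration under stated assumptions, not the authors' implementation; the module names (SpatialAttention, TemporalSelfAttention), the 1x1-conv sigmoid gating, the single-head attention, and all shapes and hyper-parameters are assumptions chosen for brevity.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialAttention(nn.Module):
    """Conv-based spatial attention: re-weight each location of a frame's feature map."""
    def __init__(self, channels):
        super().__init__()
        # 1x1 conv produces a single-channel attention map per frame (an assumption)
        self.conv = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x):                        # x: (N, C, H, W) 2D-ConvNet features
        attn = torch.sigmoid(self.conv(x))       # (N, 1, H, W) gate in [0, 1]
        return x * attn                          # emphasise motion-relevant regions

class TemporalSelfAttention(nn.Module):
    """Self-attention across frames: relate each frame's feature to every other frame."""
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, x):                        # x: (N, T, D) per-frame feature vectors
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = q @ k.transpose(1, 2) / (x.size(-1) ** 0.5)  # (N, T, T) frame affinities
        return F.softmax(scores, dim=-1) @ v     # temporally mixed features

# Usage sketch on hypothetical 2D-ConvNet feature maps for 8 RGB frames
frames = torch.randn(2, 8, 256, 14, 14)          # (batch, T, C, H, W), shapes are assumptions
spatial = SpatialAttention(256)
temporal = TemporalSelfAttention(256)

n, t, c, h, w = frames.shape
x = spatial(frames.reshape(n * t, c, h, w))      # spatial attention per frame
x = x.reshape(n, t, c, h, w).mean(dim=(3, 4))    # pool to per-frame vectors (N, T, C)
x = temporal(x)                                  # relate frames with self-attention
clip_feature = x.mean(dim=1)                     # clip-level feature for a classifier

Note that in this sketch the spatial branch gates each frame independently on 2D features and the temporal branch only mixes per-frame vectors, so no optical flow or 3D convolutions are involved, matching the motivation stated in the abstract.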
