Hi-EADN: Hierarchical Excitation Aggregation and Disentanglement Frameworks for Action Recognition Based on Videos
Symmetry (IF 2.940), Pub Date: 2021-04-12, DOI: 10.3390/sym13040662
Zeyuan Hu, Eung-Joo Lee

Most existing video action recognition methods rely mainly on high-level semantic information from convolutional neural networks (CNNs) but ignore the discrepancies between different information streams. They also typically fail to consider long-distance aggregations and short-range motions together. To address these problems, we propose hierarchical excitation aggregation and disentanglement networks (Hi-EADNs), which comprise a multiple frame excitation aggregation (MFEA) module and a feature squeeze-and-excitation hierarchical disentanglement (SEHD) module. MFEA models long- and short-range motion and calculates the feature-level temporal difference. The SEHD module uses these differences to optimize the weights of each spatiotemporal feature and to excite motion-sensitive channels. Moreover, without introducing additional parameters, this feature information is processed with a series of squeezes and excitations, and multiple temporal aggregations over neighbourhoods enhance the interaction between different motion frames. Extensive experimental results confirm the effectiveness of the proposed Hi-EADN method on the UCF101 and HMDB51 benchmark datasets, where the top-5 accuracy is 93.5% and 76.96%, respectively.
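
The idea of using a feature-level temporal difference to excite motion-sensitive channels without extra parameters can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the mean-absolute-difference squeeze, the sigmoid gate, and the residual reweighting are all assumptions chosen for illustration, and the features are assumed to be already pooled over space so that each frame is a plain list of channel values.

```python
import math

def temporal_difference(frames):
    """Feature-level difference between each frame and the next one."""
    return [
        [b - a for a, b in zip(prev, nxt)]
        for prev, nxt in zip(frames, frames[1:])
    ]

def channel_gates(diffs):
    """Squeeze: mean absolute difference per channel over time.
    Excite: map each squeezed value to a gate with a sigmoid.
    No learned weights are used (an assumption of this sketch)."""
    n_channels = len(diffs[0])
    squeezed = [
        sum(abs(d[c]) for d in diffs) / len(diffs)
        for c in range(n_channels)
    ]
    return [1.0 / (1.0 + math.exp(-s)) for s in squeezed]

def excite(frames, gates):
    """Residual reweighting x + x * gate: every channel is kept,
    channels with larger temporal change are amplified more."""
    return [[x * (1.0 + g) for x, g in zip(frame, gates)] for frame in frames]

# Toy input: 3 frames, 2 channels. Channel 0 is static, channel 1 changes.
frames = [[1.0, 0.0], [1.0, 2.0], [1.0, 4.0]]
diffs = temporal_difference(frames)
gates = channel_gates(diffs)
out = excite(frames, gates)
```

In this toy input the static channel receives the smaller gate and the changing channel the larger one, which is the qualitative behaviour the abstract attributes to motion-sensitive channel excitation.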

Updated: 2021-04-12