Motion saliency based multi-stream multiplier ResNets for action recognition
Image and Vision Computing (IF 4.7), Pub Date: 2021-01-11, DOI: 10.1016/j.imavis.2021.104108
Ming Zong, Ruili Wang, Xiubo Chen, Zhe Chen, Yuanhao Gong

In this paper, we propose Motion Saliency based multi-stream Multiplier ResNets (MSM-ResNets) for action recognition. The MSM-ResNets model consists of three interacting streams: an appearance stream, a motion stream, and a motion saliency stream. As in conventional two-stream CNN models, the appearance stream and the motion stream capture appearance information and motion information, respectively, while the motion saliency stream captures salient motion information. To make effective use of the spatiotemporal interaction between streams, the model establishes connections between the streams instead of fusing the three streams at the final output layer. Two kinds of multiplicative connections are injected: the first from the motion stream to the appearance stream, and the second from the motion saliency stream to the motion stream. Experimental results on two standard action recognition datasets, UCF101 and HMDB51, verify the effectiveness of the proposed MSM-ResNets.
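
The direction of the two multiplicative connections can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption: a multiplicative connection is taken to be an element-wise product applied to the input of a residual branch, features are plain lists of floats, and the names `residual_unit` and `msm_step` are hypothetical. It is not the authors' implementation.

```python
from typing import Callable, List, Optional, Tuple

Vector = List[float]


def relu(v: Vector) -> Vector:
    """Stand-in for a learned residual branch (assumption: a real model uses conv layers)."""
    return [max(0.0, x) for x in v]


def hadamard(a: Vector, b: Vector) -> Vector:
    """Element-wise product of two equally sized feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return [x * y for x, y in zip(a, b)]


def residual_unit(x: Vector,
                  branch: Callable[[Vector], Vector],
                  gate: Optional[Vector] = None) -> Vector:
    """Return x + branch(x * gate); without a gate this is a plain residual unit."""
    branch_input = x if gate is None else hadamard(x, gate)
    return [xi + yi for xi, yi in zip(x, branch(branch_input))]


def msm_step(appearance: Vector, motion: Vector, saliency: Vector,
             branch: Callable[[Vector], Vector] = relu
             ) -> Tuple[Vector, Vector, Vector]:
    """One hypothetical layer of the three interacting streams."""
    # The motion saliency stream receives no cross-stream connection.
    new_saliency = residual_unit(saliency, branch)
    # Connection 2: motion saliency stream -> motion stream.
    new_motion = residual_unit(motion, branch, gate=saliency)
    # Connection 1: motion stream -> appearance stream.
    new_appearance = residual_unit(appearance, branch, gate=motion)
    return new_appearance, new_motion, new_saliency
```

Under these assumptions, calling `msm_step([1.0, 2.0], [0.5, -1.0], [2.0, 0.0])` updates all three streams in one step, with the appearance features modulated by the motion features and the motion features modulated by the saliency features before each residual branch is applied.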




Updated: 2021-01-28