A novel keyframe extraction method for video classification using deep neural networks
Neural Computing and Applications (IF 4.5), Pub Date: 2021-08-02, DOI: 10.1007/s00521-021-06322-x
Rukiye Savran Kızıltepe, John Q. Gan, Juan José Escobar

Combining convolutional neural networks (CNNs) and recurrent neural networks (RNNs) produces a powerful architecture for video classification, as spatial–temporal information can be processed simultaneously and effectively. Using transfer learning, this paper presents a comparative study investigating how temporal information can be exploited to improve video classification performance when CNNs and RNNs are combined in various architectures. To further improve the best-performing CNN-RNN combination identified in this study, a novel action-template-based keyframe extraction method is proposed, which identifies the informative region of each frame and selects keyframes based on the similarity between those regions. Extensive experiments have been conducted on the KTH and UCF-101 datasets with ConvLSTM-based video classifiers. The experimental results are evaluated using one-way analysis of variance, which reveals the effectiveness of the proposed keyframe extraction method: it significantly improves video classification accuracy.
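
The abstract does not specify the network configuration. As a rough sketch of the kind of CNN+RNN combination it describes, the Keras model below feeds per-frame features from a frozen, ImageNet-pretrained CNN (the transfer-learning step) into a ConvLSTM layer for temporal fusion. The MobileNetV2 backbone, frame size, sequence length, and layer widths are assumptions for illustration, not the authors' settings.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 101            # UCF-101 (KTH would use 6)
SEQ_LEN, H, W = 16, 64, 64   # keyframes per clip and frame size (assumed)

# Frozen, pretrained CNN backbone: the transfer-learning component.
backbone = tf.keras.applications.MobileNetV2(
    include_top=False, weights="imagenet", input_shape=(H, W, 3))
backbone.trainable = False

model = models.Sequential([
    layers.Input((SEQ_LEN, H, W, 3)),
    layers.TimeDistributed(backbone),                      # spatial features per frame
    layers.ConvLSTM2D(64, kernel_size=3, padding="same"),  # temporal fusion
    layers.GlobalAveragePooling2D(),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Swapping the ConvLSTM2D layer for a plain LSTM over pooled per-frame features would give one of the alternative CNN-RNN arrangements such a comparative study could cover.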
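For the keyframe extraction method, the abstract gives only the two ingredients: find each frame's informative region, then select keyframes based on the similarity between those regions. The sketch below is one plausible reading, assuming simple frame differencing as a stand-in for the paper's action template and cosine similarity between resized regions; the thresholds and the fallback behaviour are invented for illustration.

```python
import cv2
import numpy as np

def informative_region(frame, prev_frame, diff_thresh=25):
    """Crop the moving region found by frame differencing (a stand-in for
    the paper's action template, whose exact construction is not given)."""
    diff = cv2.absdiff(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY),
                       cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY))
    _, mask = cv2.threshold(diff, diff_thresh, 255, cv2.THRESH_BINARY)
    ys, xs = np.nonzero(mask)
    if xs.size == 0:                 # no detected motion: fall back to full frame
        return frame
    return frame[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

def select_keyframes(frames, sim_thresh=0.9, size=(64, 64)):
    """Keep a frame whenever its informative region is sufficiently
    dissimilar (cosine similarity below sim_thresh) to the last kept one."""
    keyframes = [frames[0]]
    last_vec = None
    for prev, cur in zip(frames, frames[1:]):
        region = cv2.resize(informative_region(cur, prev), size)
        vec = region.astype(np.float32).ravel()
        vec /= np.linalg.norm(vec) + 1e-8
        if last_vec is None or float(vec @ last_vec) < sim_thresh:
            keyframes.append(cur)    # dissimilar enough: treat as a keyframe
            last_vec = vec
    return keyframes
```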
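The statistical evaluation, one-way analysis of variance over classification accuracies, can be reproduced with SciPy as follows; the accuracy values are illustrative placeholders, not results from the paper.

```python
from scipy.stats import f_oneway

# Accuracies from repeated runs of each pipeline (illustrative values only).
uniform_sampling = [0.88, 0.87, 0.89, 0.88, 0.90]
proposed_keyframes = [0.92, 0.93, 0.91, 0.94, 0.92]

f_stat, p_value = f_oneway(uniform_sampling, proposed_keyframes)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")  # p < 0.05 => significant difference
```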




Updated: 2021-08-02