A Comparative Study of Transfer Learning Approaches for Video Anomaly Detection
International Journal of Pattern Recognition and Artificial Intelligence (IF 0.9), Pub Date: 2020-12-09, DOI: 10.1142/s0218001421520030
Matheus Gutoski, Manassés Ribeiro, Leandro T. Hattori, Marcelo Romero, André E. Lazzaretti, Heitor S. Lopes

Recent research has shown that features extracted from pretrained Convolutional Neural Network (CNN) models can be readily applied to a variety of problems they were not originally designed to solve. This concept, often referred to as Transfer Learning (TL), is common practice when labeled data is limited. In some fields, such as video anomaly detection, TL remains underexplored, in the sense that it is not clear whether the architecture of the pretrained CNN model affects video anomaly detection performance. To clarify this issue, we perform an extensive benchmark using 12 different CNN models pretrained on ImageNet as feature extractors and apply the resulting features to seven video anomaly detection benchmark datasets. Our experiments show that a simple classification process using One-Class Support Vector Machines yields results similar to those of state-of-the-art models. Moreover, a statistical analysis suggests that architectural differences are negligible when choosing a pretrained model for video anomaly detection, since all models presented similar performance. Finally, we present an in-depth visual analysis of the Avenue dataset and reveal several aspects that may be limiting the performance of state-of-the-art video anomaly detection methods.
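
The pipeline the abstract describes (fixed features fed to a One-Class Support Vector Machine trained only on normal data) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the feature vectors are synthetic Gaussian stand-ins for CNN activations, and the feature dimension, the RBF kernel, the `nu` and `gamma` settings, the helper names, and the use of ROC AUC as the metric are all assumptions made for the example.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# ASSUMPTION: synthetic vectors stand in for CNN features. In a real
# pipeline each row would be the activation vector that a pretrained
# network produces for one video frame.
FEATURE_DIM = 64
normal_train = rng.normal(0.0, 1.0, size=(500, FEATURE_DIM))
normal_test = rng.normal(0.0, 1.0, size=(100, FEATURE_DIM))
anomalous_test = rng.normal(4.0, 1.0, size=(100, FEATURE_DIM))


def fit_detector(train_features):
    """Fit a One-Class SVM on features of normal frames only."""
    # ASSUMPTION: kernel, gamma and nu are illustrative defaults,
    # not values taken from the paper.
    detector = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1)
    detector.fit(train_features)
    return detector


def anomaly_scores(detector, features):
    """Return one score per row; higher means more anomalous."""
    # decision_function is positive for inliers, so negate it.
    return -detector.decision_function(features)


detector = fit_detector(normal_train)

# Evaluate on a mix of normal (label 0) and anomalous (label 1) features.
test_features = np.vstack([normal_test, anomalous_test])
labels = np.concatenate([np.zeros(len(normal_test)), np.ones(len(anomalous_test))])
scores = anomaly_scores(detector, test_features)
auc = roc_auc_score(labels, scores)
print(f"ROC AUC on synthetic features: {auc:.3f}")
```

To compare architectures in the spirit of the benchmark, one would presumably repeat the fit-and-score step once per feature extractor and compare the resulting scores across datasets; the synthetic data here only shows the mechanics and says nothing about real detection performance.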

Updated: 2020-12-09