Multi-feature-based crowd video modeling for visual event detection, Multimedia Systems

Multi-feature-based crowd video modeling for visual event detection
Multimedia Systems (IF 3.5), Pub Date: 2020-04-04, DOI: 10.1007/s00530-020-00652-x
Habib Ullah, Ihtesham Ul Islam, Mohib Ullah, Muhammad Afaq, Sultan Daud Khan, Javed Iqbal

We propose a novel method for modeling crowd video dynamics that adopts a two-stream convolutional architecture combining spatial and temporal networks. The method addresses the key challenge of capturing complementary information: appearance from still frames and motion between frames. A motion flow field is obtained from the video through dense optical flow. We demonstrate that the method, trained on multi-frame dense optical flow, achieves a significant improvement in performance despite limited training data. We train and evaluate the method on a benchmark crowd video dataset. The experimental results show that it outperforms five reference methods, which were chosen because they are the most relevant to our work.
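
As a rough illustration of the two ideas named in the abstract, the sketch below shows one way multi-frame dense optical flow could be arranged as input channels for a temporal network, and one way the outputs of the spatial and temporal streams could be combined. The abstract does not state the channel layout or the fusion rule, so both are assumptions here: the horizontal and vertical flow components of consecutive frames are interleaved, and the streams are fused by a weighted average of softmax scores. The names `stack_flow_channels`, `fuse_streams`, and `temporal_weight` are hypothetical and do not come from the paper.

```python
import math


def stack_flow_channels(flows, start, length):
    """Stack `length` consecutive flow fields into 2 * length channels.

    `flows` is a list of (dx, dy) pairs, one per frame transition, where
    dx and dy are 2D lists of horizontal and vertical displacements.
    The interleaved layout [dx_t, dy_t, dx_t+1, dy_t+1, ...] is an
    assumption, not something the abstract specifies.
    """
    if start < 0 or length <= 0 or start + length > len(flows):
        raise ValueError("window exceeds the available flow fields")
    channels = []
    for dx, dy in flows[start:start + length]:
        channels.append(dx)
        channels.append(dy)
    return channels


def softmax(scores):
    """Numerically stable softmax over a list of raw class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(spatial_scores, temporal_scores, temporal_weight=0.5):
    """Late fusion by weighted averaging of per-stream softmax scores.

    Averaging is an assumed fusion rule chosen for illustration only.
    """
    p_spatial = softmax(spatial_scores)
    p_temporal = softmax(temporal_scores)
    return [
        (1.0 - temporal_weight) * a + temporal_weight * b
        for a, b in zip(p_spatial, p_temporal)
    ]
```

In this sketch, a window of `length` flow fields becomes a single input with `2 * length` channels, which is what lets a temporal network see motion across several frames at once. The raw scores passed to `fuse_streams` would come from whatever spatial and temporal networks are actually trained.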

Updated: 2020-04-04
