MPASNET: Motion Prior-Aware Siamese Network for Unsupervised Deep Crowd Segmentation in Video Scenes,arXiv - CS - Computer Vision and Pattern Recognition


arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2021-01-21, DOI: arxiv-2101.08609
Jinhai Yang, Hua Yang

Crowd segmentation is a fundamental task that underpins crowded scene analysis, and refined pixel-level segmentation maps are highly desirable. However, it remains a challenging problem: existing approaches either require dense pixel-level annotations to train deep learning models or produce only rough segmentation maps from optical or particle flows with physical models. In this paper, we propose the Motion Prior-Aware Siamese Network (MPASNET) for unsupervised crowd semantic segmentation. This model not only eliminates the need for annotation but also yields high-quality segmentation maps. Specifically, we first analyze the coherent motion patterns across frames and then apply a circular region merging strategy to the collective particles to generate pseudo-labels. Moreover, we equip MPASNET with siamese branches for augmentation-invariant regularization and siamese feature aggregation. Experiments on benchmark datasets indicate that our model outperforms state-of-the-art methods by more than 12% in terms of mIoU.
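
The abstract names two ideas that are easy to illustrate in a few lines: the mIoU metric used for evaluation and augmentation-invariant regularization across two siamese branches. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names `mean_iou` and `flip_consistency_loss`, the choice of a horizontal flip as the augmentation, and the mean-squared penalty are all hypothetical; the abstract does not specify the augmentations or the loss actually used by MPASNET.

```python
import numpy as np


def mean_iou(pred, target, num_classes=2):
    """Mean intersection-over-union, averaged over classes.

    Classes absent from both maps are skipped so they do not
    distort the average. Returns NaN if no class is present.
    """
    pred = np.asarray(pred)
    target = np.asarray(target)
    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class c appears in neither map
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious)) if ious else float("nan")


def flip_consistency_loss(model, frame):
    """Illustrative augmentation-invariance penalty (an assumption).

    One branch sees the original frame, the other a horizontally
    flipped copy. The second output is flipped back so that both
    outputs are pixel-aligned, and the mean squared difference
    between them is returned. `model` is any callable mapping an
    (H, W) array to an (H, W) array.
    """
    out = model(frame)
    out_aug = model(frame[:, ::-1])[:, ::-1]
    return float(np.mean((out - out_aug) ** 2))


if __name__ == "__main__":
    pred = np.array([[1, 1], [0, 0]])    # 1 = crowd, 0 = background
    target = np.array([[1, 0], [0, 0]])
    print(mean_iou(pred, target))

    frame = np.ones((2, 2))
    pointwise = lambda x: 2.0 * x        # unaffected by flipping
    print(flip_consistency_loss(pointwise, frame))
```

A model whose output at each pixel depends only on that pixel incurs zero penalty under this sketch, while a model whose output depends on absolute image position is penalized, which is the intuition behind regularizing the two branches toward agreement.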


Updated: 2021-01-22
