Crowd Scene Analysis Encounters High Density and Scale Variation, IEEE Transactions on Image Processing


Crowd Scene Analysis Encounters High Density and Scale Variation
IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2021-01-27, DOI: 10.1109/tip.2021.3049963
Yao Xue, Yonghui Li, Siming Liu, Xingjun Zhang, Xueming Qian

Crowd scene analysis receives growing attention due to its wide applications. Grasping the accurate crowd location is important for identifying high-risk regions. In this article, we propose a Compressed Sensing based Output Encoding (CSOE) scheme, which casts the detection of pixel coordinates of small objects as a signal regression task in the encoding signal space. To prevent gradient vanishing, we derive our own sparse reconstruction backpropagation rule, which adapts to distinct implementations of sparse reconstruction and makes the whole model end-to-end trainable. With the support of CSOE and the backpropagation rule, the proposed method is more robust to deep model training error, which is especially harmful to crowd counting and localization. The proposed method achieves state-of-the-art performance across four mainstream datasets, with especially strong results in highly crowded scenes. A series of analyses and experiments supports our claim that, for highly crowded scenes, regression in CSOE space is better than the traditional approach of detecting coordinates of small objects in pixel space.
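
The idea of encoding point locations as a compressed signal and recovering them by sparse reconstruction can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not specify the sensing matrix or the reconstruction algorithm, so the random Gaussian matrix, the orthogonal matching pursuit (OMP) solver, the grid size, and the helper names `encode`, `omp`, and `decode` below are all assumptions chosen for illustration. In the paper a deep network regresses the encoded signal; here the exact encoding is decoded directly.

```python
import numpy as np

def encode(points, shape, A):
    """Flatten a binary location map and project it with sensing matrix A."""
    x = np.zeros(shape[0] * shape[1])
    for r, c in points:
        x[r * shape[1] + c] = 1.0
    return A @ x

def omp(A, y, k):
    """Orthogonal matching pursuit: greedily recover a k-sparse x from y = A x."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        idx = int(np.argmax(np.abs(A.T @ residual)))
        support.append(idx)
        # Least-squares fit on the selected columns, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coef
    return x_hat

def decode(y, shape, A, k):
    """Reconstruct the sparse map and convert nonzero entries back to (row, col)."""
    x_hat = omp(A, y, k)
    idx = np.flatnonzero(np.abs(x_hat) > 0.5)
    return sorted((int(i // shape[1]), int(i % shape[1])) for i in idx)

# Hypothetical setup: a 10x10 grid (100 pixels) compressed to 60 measurements.
rng = np.random.default_rng(0)
shape = (10, 10)
m = 60
A = rng.standard_normal((m, shape[0] * shape[1])) / np.sqrt(m)

points = [(1, 2), (4, 7), (8, 3)]   # assumed head locations
y = encode(points, shape, A)        # the shorter signal a model would regress
recovered = decode(y, shape, A, k=len(points))
print(recovered)
```

In this toy setting the regression target is a 60-dimensional vector rather than a 100-pixel map, and the locations are recovered only after sparse reconstruction; the sparsity level `k` is passed in by hand here, which a real system would have to estimate.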


Updated: 2021-02-16
