Impact of Labeling Schemes on Dense Crowd Counting Using Convolutional Neural Networks with Multiscale Upsampling, International Journal of Pattern Recognition and Artificial Intelligence



International Journal of Pattern Recognition and Artificial Intelligence (IF 1.5), Pub Date: 2021-09-15, DOI: 10.1142/s0218001421600120
Greg Olmschenk (1, 2), Xuan Wang (1), Hao Tang (3), Zhigang Zhu (1, 4)


Gatherings of thousands to millions of people frequently occur for an enormous variety of educational, social, sporting, and political events, and automated counting of these high-density crowds is useful for safety, management, and measuring the significance of an event. In this work, we show that the commonly accepted labeling scheme of crowd density maps for training deep neural networks may not be the most effective one. We propose an alternative inverse k-nearest neighbor (ikNN) map mechanism that, even when used directly in existing state-of-the-art network structures, shows superior performance. We also provide new network architecture mechanisms that we demonstrate in our own MUD-ikNN network architecture, which uses multi-scale drop-in replacement upsampling via transposed convolutions to take full advantage of the provided ikNN labeling. This upsampling combined with the ikNN maps further improves crowd counting accuracy. We further analyze several variations of the ikNN labeling mechanism, which apply transformations to the kNN measure before generating the map, in order to consider the impact of camera perspective views, image resolutions, and the changing rates of the mapping functions. To alleviate the effects of crowd density changes within each image, we also introduce an attenuation mechanism in the ikNN mapping. Experimentally, we show that the inverse square root kNN map variation (iRkNN) provides the best performance. Discussions are provided on computational complexity, label resolutions, the gains in mapping and upsampling, and details of critical cases such as various crowd counts, uneven crowd densities, and crowd occlusions.
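
The abstract names the ikNN labeling but does not give its formula, so the following is only a minimal sketch of how such a map might be built from head annotations. It assumes the map value at each pixel is 1 / (f(d) + 1), where d is the mean distance to the k nearest labeled heads and f is an optional transformation (for example a square root for an iRkNN-style variant). The function name `inverse_knn_map`, its parameters, and the exact functional form are illustrative assumptions, not taken from the paper, and the attenuation mechanism is not modeled.

```python
import numpy as np


def inverse_knn_map(head_positions, height, width, k=1, transform=None):
    """Build an inverse kNN label map (hypothetical helper, not the authors' code).

    Assumed form: value = 1 / (f(d) + 1), where d is the mean Euclidean
    distance from a pixel to its k nearest head annotations and f is an
    optional transformation applied to d before inversion.
    """
    heads = np.asarray(head_positions, dtype=float)  # shape (n, 2), as (row, col)
    if heads.ndim != 2 or heads.shape[1] != 2 or len(heads) == 0:
        raise ValueError("head_positions must be a non-empty list of (row, col) pairs")
    k = min(k, len(heads))  # cannot use more neighbors than there are heads

    # Pixel coordinates, flattened to shape (height * width, 1, 2) for broadcasting.
    rows, cols = np.mgrid[0:height, 0:width]
    pixels = np.stack([rows, cols], axis=-1).reshape(-1, 1, 2).astype(float)

    # Distance from every pixel to every head: shape (height * width, n).
    distances = np.linalg.norm(pixels - heads[None, :, :], axis=-1)

    # The first k columns after partitioning hold the k smallest distances.
    nearest = np.partition(distances, k - 1, axis=1)[:, :k]
    mean_knn = nearest.mean(axis=1).reshape(height, width)

    if transform is not None:
        mean_knn = transform(mean_knn)  # e.g. np.sqrt for an inverse-root variant
    return 1.0 / (mean_knn + 1.0)


# Usage: one annotated head at the center of a 5x5 image.
label_map = inverse_knn_map([(2, 2)], height=5, width=5, k=1)
root_map = inverse_knn_map([(2, 2)], height=5, width=5, k=1, transform=np.sqrt)
```

Under this assumed form the map peaks at 1.0 on each annotated head and decays smoothly with distance, so every pixel carries a nonzero training signal. This brute-force version computes all pixel-to-head distances at once; for large images or crowds, a spatial index such as a k-d tree would likely be preferable.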


Updated: 2021-09-15
