Towards Using Count-level Weak Supervision for Crowd Counting

Pattern Recognition (IF 7.5), Pub Date: 2021-01-01, DOI: 10.1016/j.patcog.2020.107616
Yinjie Lei, Yan Liu, Pingping Zhang, Lingqiao Liu

Most existing crowd counting methods require location-level annotation, i.e., a dot placed at the center of each object. Although simpler than bounding-box or pixel-level annotation, this annotation is still labor-intensive and time-consuming to obtain, especially for images of highly crowded scenes. In contrast, weaker annotations that record only the total object count can be obtained almost effortlessly in many practical scenarios. It is therefore desirable to develop a learning method that can effectively train models from count-level annotations. To this end, this paper studies weakly-supervised crowd counting, in which a model is learned from a small number of location-level annotations (fully supervised) and a large number of count-level annotations (weakly supervised). We observe that, in this setting, the direct solution of regressing the integral of the density map to the object count is not sufficient, and that it is beneficial to introduce stronger regularization on the predicted density maps of weakly-annotated images. We devise a simple yet effective training strategy, Multiple Auxiliary Tasks Training (MATT), which constructs regularizers that restrict the freedom of the generated density maps. Through extensive experiments on existing datasets and a newly proposed dataset, we validate the effectiveness of the proposed weakly-supervised method and demonstrate its superior performance over existing solutions.
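As an illustration of the "direct solution" the abstract refers to, the following is a minimal sketch of a per-image loss that regresses the integral (pixel sum) of a predicted density map to the annotated count. It is not the authors' implementation, and it does not attempt MATT, whose auxiliary tasks are not specified in the abstract. The function names, the use of mean squared error for fully-annotated images, and the `count_weight` parameter are all assumptions made for the example.

```python
from typing import List, Optional

DensityMap = List[List[float]]


def integrate(density: DensityMap) -> float:
    """Sum every pixel of a density map; the sum is the predicted count."""
    return sum(sum(row) for row in density)


def density_loss(pred: DensityMap, target: DensityMap) -> float:
    """Mean squared error between predicted and ground-truth density maps
    (assumed loss for images with location-level annotation)."""
    n_pixels = sum(len(row) for row in pred)
    squared = sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )
    return squared / n_pixels


def count_loss(pred: DensityMap, true_count: float) -> float:
    """Squared error between the integrated density map and the annotated count."""
    return (integrate(pred) - true_count) ** 2


def baseline_loss(
    pred: DensityMap,
    true_count: float,
    target: Optional[DensityMap] = None,
    count_weight: float = 1.0,  # hypothetical weighting, not from the paper
) -> float:
    """Density loss when a location-level target exists, count loss otherwise."""
    if target is not None:
        return density_loss(pred, target)
    return count_weight * count_loss(pred, true_count)


# Two very different 2x2 density maps that both integrate to a count of 2.
pred_a = [[1.0, 0.0], [0.0, 1.0]]  # mass concentrated on two pixels
pred_b = [[0.5, 0.5], [0.5, 0.5]]  # mass spread uniformly

print(count_loss(pred_a, 2.0), count_loss(pred_b, 2.0))
print(density_loss(pred_b, pred_a))
```

Both maps receive the same count loss even though they place the density in different locations, which is one way to see why count regression alone leaves the density map under-constrained on weakly-annotated images and why additional regularization is argued to help.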


Updated: 2021-01-01
