Decoupled Two-Stage Crowd Counting and Beyond, IEEE Transactions on Image Processing



IEEE Transactions on Image Processing (IF 10.6), Pub Date: 2021-02-04, DOI: 10.1109/tip.2021.3055631
Jian Cheng, Haipeng Xiong, Zhiguo Cao, Hao Lu

One appealing approach to counting dense objects, such as crowds, is density map estimation. Density maps, however, present ambiguous appearance cues in congested scenes, making it infeasible to identify individuals and difficult to diagnose errors. Inspired by the observation that counting can be interpreted as a two-stage process, i.e., identifying possible object regions and counting exact object numbers, we introduce a probabilistic intermediate representation termed the probability map, which depicts the probability of each pixel being an object. This representation allows us to decouple counting into probability map regression (PMR) and count map regression (CMR). We therefore propose a novel decoupled two-stage counting (D2C) framework that sequentially regresses the probability map and learns a counter conditioned on the probability map. Given the probability map and the count map, a peak point detection algorithm is derived to localize each object with a point under the guidance of local counts. An advantage of D2C is that the counter can be learned reliably with additional synthesized probability maps. This addresses the important data deficiency and sample imbalance problems in counting. Our framework also enables easy diagnosis and analysis of error patterns. For instance, we find that the counter per se is sufficiently accurate, while the bottleneck appears to be PMR. We further instantiate a network, D2CNet, in our framework and report state-of-the-art counting and localization performance across 6 crowd counting benchmarks. Since the probability map is a representation independent of visual appearance, D2CNet also exhibits remarkable cross-dataset transferability. Code and pretrained models are made available at: https://git.io/d2cnet
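To make the localization step concrete, here is a minimal sketch of one way a probability map and a count map could be combined to place points. It is not the peak point detection algorithm derived in the paper. The function names `localize_points` and `total_count`, the `block` parameter (the side length of the image region that one count-map cell is assumed to cover), and the simple top-k rule are all assumptions made for illustration.

```python
from typing import List, Tuple

Grid = List[List[float]]


def localize_points(prob_map: Grid, count_map: Grid, block: int) -> List[Tuple[int, int]]:
    """Pick round(local count) highest-probability pixels in each block.

    Illustrative assumption: each count-map cell covers a block x block
    region of the probability map, and objects sit at the most probable
    pixels of that region. The paper's own algorithm may differ.
    """
    points: List[Tuple[int, int]] = []
    for bi, count_row in enumerate(count_map):
        for bj, local_count in enumerate(count_row):
            k = max(0, round(local_count))  # number of points to place here
            if k == 0:
                continue
            # Gather (probability, row, col) for every pixel in this block.
            candidates = []
            for r in range(bi * block, min((bi + 1) * block, len(prob_map))):
                for c in range(bj * block, min((bj + 1) * block, len(prob_map[r]))):
                    candidates.append((prob_map[r][c], r, c))
            # Highest probability first; keep the top k as object locations.
            candidates.sort(key=lambda t: t[0], reverse=True)
            points.extend((r, c) for _, r, c in candidates[:k])
    return points


def total_count(count_map: Grid) -> float:
    """The overall count is the sum of all local counts."""
    return sum(sum(row) for row in count_map)


# Toy inputs: a 4x4 probability map and a 2x2 count map (block size 2).
prob_map = [
    [0.1, 0.9, 0.0, 0.0],
    [0.2, 0.1, 0.0, 0.1],
    [0.8, 0.1, 0.3, 0.7],
    [0.1, 0.6, 0.2, 0.95],
]
count_map = [
    [1.1, 0.0],
    [1.8, 2.2],
]
points = localize_points(prob_map, count_map, block=2)
print(points)
print(total_count(count_map))
```

In this toy setup the local counts only decide how many points each block receives, while the probability map decides where they go. That mirrors the division of labor described in the abstract, but only in a simplified form.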


Updated: 2021-02-16
