Locating and Counting Heads in Crowds With a Depth Prior.
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 23.6), Pub Date: 2022-11-07, DOI: 10.1109/tpami.2021.3124956
Dongze Lian , Xianing Chen , Jing Li , Weixin Luo , Shenghua Gao

To simultaneously estimate the number of heads and locate them with bounding boxes, we resort to detection-based crowd counting, leveraging RGB-D data, and design a dual-path guided detection network (DPDNet). Specifically, to improve the performance of detection-based approaches on dense and tiny heads, we propose a density map guided detection module, which leverages the density map to improve head/non-head classification in the detection network, where the density implies the probability of a pixel being a head. A depth-adaptive kernel that accounts for the variance in head sizes is also introduced to generate a high-fidelity density map for more robust density map regression. To prevent dense heads from being filtered out during post-processing, we use this density map in the post-processing of head detection and propose a density map guided NMS strategy. Meanwhile, to improve the detection of small heads, we propose a depth-guided detection module that generates a dynamic dilated convolution to extract features of heads at different scales, and we further design a depth-aware anchor for better initialization of anchor sizes in the detection framework. We then train DPDNet with bounding boxes whose sizes are generated from depth. Because existing RGB-D datasets are too small to evaluate data-driven approaches, we collect two large-scale RGB-D crowd counting datasets: a synthetic dataset and a real-world dataset. Since depth values at long-distance positions cannot be obtained in the real-world dataset, we further propose a depth completion method based on meta learning, which fully utilizes the synthetic depth data to complete the depth values at long-distance positions. Extensive experiments on our two proposed RGB-D datasets and the MICC RGB-D counting dataset show that our method achieves the best performance for RGB-D crowd counting and localization.
Further, our method can be easily extended to RGB image based crowd counting and achieves comparable or even better performance on the RGB datasets for both head counting and localization.
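
As a rough illustration of the depth-adaptive kernel idea mentioned in the abstract, the sketch below builds a density map in which each annotated head contributes a Gaussian whose width shrinks as depth grows. The abstract does not give the actual formula, so the inverse-depth rule `sigma = k / depth`, the function name `depth_adaptive_density_map`, and the parameters `k`, `min_sigma`, and `max_sigma` are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def depth_adaptive_density_map(points, depth, k=4.0, min_sigma=1.0, max_sigma=15.0):
    """Build a density map with one Gaussian per head.

    points: iterable of (row, col) integer head centers (hypothetical input format).
    depth:  H x W array of depth values, same resolution as the image.
    k:      assumed scale constant; sigma = k / depth is an illustrative choice,
            based on apparent head size shrinking with distance.
    """
    h, w = depth.shape
    density = np.zeros((h, w), dtype=np.float64)
    rows = np.arange(h)[:, None]  # column vector of row indices
    cols = np.arange(w)[None, :]  # row vector of column indices

    for r, c in points:
        d = max(float(depth[r, c]), 1e-6)  # guard against zero depth
        sigma = float(np.clip(k / d, min_sigma, max_sigma))
        kernel = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
        # Normalize each kernel so every head adds exactly 1 to the total count.
        density += kernel / kernel.sum()

    return density
```

Because each kernel is normalized over the image, summing the map recovers the head count, while nearer heads (small depth, large sigma) spread their mass over more pixels than distant ones.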

Updated: 2021-11-04