A Self-Training Approach for Point-Supervised Object Detection and Counting in Crowds
IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2021-02-04, DOI: 10.1109/tip.2021.3055632
Yi Wang, Junhui Hou, Xinyu Hou, Lap-Pui Chau

In this article, we propose a novel self-training approach named Crowd-SDNet that enables a typical object detector trained only with point-level annotations (i.e., objects are labeled with points) to estimate both the center points and sizes of crowded objects. Specifically, during training, we utilize the available point annotations to supervise the estimation of the center points of objects directly. Based on a locally-uniform distribution assumption, we initialize pseudo object sizes from the point-level supervisory information, which are then leveraged to guide the regression of object sizes via a crowdedness-aware loss. Meanwhile, we propose a confidence and order-aware refinement scheme to continuously refine the initial pseudo object sizes such that the ability of the detector is increasingly boosted to detect and count objects in crowds simultaneously. Moreover, to address extremely crowded scenes, we propose an effective decoding method to improve the detector’s representation ability. Experimental results on the WiderFace benchmark show that our approach significantly outperforms state-of-the-art point-supervised methods under both detection and counting tasks, i.e., our method improves the average precision by more than 10% and reduces the counting error by 31.2%. Besides, our method obtains the best results on the crowd counting and localization datasets (i.e., ShanghaiTech and NWPU-Crowd) and vehicle counting datasets (i.e., CARPK and PUCPR+) compared with state-of-the-art counting-by-detection methods. The code will be publicly available at https://github.com/WangyiNTU/Point-supervised-crowd-detection .
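
The abstract does not say how pseudo object sizes are initialized from the point annotations, so the following is only a rough sketch of one way such an initialization could look. It assumes, hypothetically, that a locally uniform distribution lets the distance to the nearest neighboring point stand in for object size; the function names, the `default_size` fallback, and the `scale` factor are illustrative and not taken from Crowd-SDNet.

```python
import numpy as np


def init_pseudo_sizes(points, default_size=32.0, scale=1.0):
    """Return one pseudo size per point (hypothetical initialization).

    Assumption: each object's side length is approximated by the distance
    to its nearest annotated neighbor, multiplied by `scale`.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    if n < 2:
        # No neighbor to measure against: fall back to an assumed default.
        return np.full(n, default_size)
    # Pairwise Euclidean distances between all annotated points.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Ignore the zero distance from each point to itself.
    np.fill_diagonal(dist, np.inf)
    return scale * dist.min(axis=1)


def points_to_boxes(points, sizes):
    """Build square pseudo boxes (x1, y1, x2, y2) centered on each point."""
    pts = np.asarray(points, dtype=float)
    half = np.asarray(sizes, dtype=float)[:, None] / 2.0
    return np.hstack([pts - half, pts + half])


if __name__ == "__main__":
    annotated = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]
    sizes = init_pseudo_sizes(annotated)
    print(points_to_boxes(annotated, sizes))
```

In a self-training setup like the one the abstract describes, boxes of this kind would serve only as a starting point that later refinement rounds replace with better size estimates.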

Updated: 2021-02-16