Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark
arXiv - CS - Computer Vision and Pattern Recognition, Pub Date: 2021-05-06, DOI: arxiv-2105.02440
Longyin Wen, Dawei Du, Pengfei Zhu, Qinghua Hu, Qilong Wang, Liefeng Bo, Siwei Lyu

To promote the development of object detection, tracking, and counting algorithms for drone-captured videos, we construct a benchmark with a new large-scale drone-captured dataset, named DroneCrowd, consisting of 112 video clips with 33,600 HD frames in various scenarios. Notably, we annotate 20,800 people trajectories with 4.8 million heads and several video-level attributes. Meanwhile, we design the Space-Time Neighbor-Aware Network (STNNet) as a strong baseline to solve object detection, tracking, and counting jointly in dense crowds. STNNet consists of a feature extraction module, followed by density map estimation heads and localization and association subnets. To exploit the context information of neighboring objects, we design a neighboring context loss to guide the training of the association subnet; this loss enforces consistent relative positions of nearby objects in the temporal domain. Extensive experiments on our DroneCrowd dataset demonstrate that STNNet performs favorably against the state of the art.
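
The idea of enforcing consistent relative positions between nearby objects across frames can be illustrated with a small sketch. This is not STNNet's neighboring context loss, whose exact form the abstract does not give; it is a hypothetical penalty, with the function names `k_nearest` and `relative_position_penalty` and the parameter `k` invented here for illustration. It assumes that index `i` in both point lists refers to the same candidate association, and it measures how much each head's offsets to its nearest neighbors change from one frame to the next.

```python
from math import hypot


def k_nearest(points, i, k):
    """Indices of the (up to) k points closest to points[i], excluding i."""
    xi, yi = points[i]
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: hypot(points[j][0] - xi, points[j][1] - yi))
    return others[:k]


def relative_position_penalty(prev_pts, curr_pts, k=2):
    """Mean change in neighbor offsets between two consecutive frames.

    Assumption (illustrative only): prev_pts[i] and curr_pts[i] are the
    (x, y) head positions that a candidate association assigns to the
    same person. Neighbors are chosen in the previous frame.
    """
    n = len(prev_pts)
    if n < 2 or n != len(curr_pts):
        return 0.0
    total, count = 0.0, 0
    for i in range(n):
        for j in k_nearest(prev_pts, i, k):
            # Offset from person i to neighbor j in each frame.
            dx_prev = prev_pts[j][0] - prev_pts[i][0]
            dy_prev = prev_pts[j][1] - prev_pts[i][1]
            dx_curr = curr_pts[j][0] - curr_pts[i][0]
            dy_curr = curr_pts[j][1] - curr_pts[i][1]
            # Penalize any change in that offset over time.
            total += hypot(dx_curr - dx_prev, dy_curr - dy_prev)
            count += 1
    return total / count


prev = [(0, 0), (1, 0), (0, 1)]
moved = [(5, 3), (6, 3), (5, 4)]    # whole group shifted together
swapped = [(6, 3), (5, 3), (5, 4)]  # first two identities exchanged

print(relative_position_penalty(prev, moved))
print(relative_position_penalty(prev, swapped))
```

In this sketch a group that moves together keeps all neighbor offsets unchanged and incurs no penalty, while an association that exchanges two identities changes the offsets and is penalized. That is the intuition behind using neighbor context to guide association in dense crowds.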

Updated: 2021-05-07