Learning scene-specific object detectors based on a generative-discriminative model with minimal supervision
Pattern Recognition Letters (IF 5.1), Pub Date: 2022-05-10, DOI: 10.1016/j.patrec.2022.05.007
Dapeng Luo 1 , Siyuan Lei 1 , Peng Guo 1 , Changxin Gao 2 , Ying Chen 3 , Jinsheng Li 3 , Longsheng Wei 4

In multi-scene object detection, a single object class may vary widely in appearance because of differing illumination, backgrounds, and camera viewpoints. Traditional object detection methods generally perform poorly in unconstrained video environments. To address this problem, many modern approaches provide deep hierarchical appearance representations for object detection, but most of these methods require time-consuming training on large, manually annotated sample sets. In this paper, we propose a self-learning object detection framework that resolves the multi-scene detection problem in a bottom-up manner. A scene-specific detector is obtained from an autonomous learning process triggered by marking several bounding boxes around an object in the first video frame with a mouse. Neither manually labeled training data nor generic detectors are needed. This learning process can be conveniently replicated in different surveillance scenarios to produce scene-specific detectors for various camera viewpoints. The initial scene-specific detector, initialized from only several bounding boxes, exhibits poor detection performance and is difficult to improve with traditional online learning algorithms. Consequently, we propose a detection method based on a Generative-Discriminative model (GDM) that partitions the detection response space and assigns each partition an individual descriptor that progressively achieves high classification accuracy. An online gradual optimization process is proposed to optimize the Generative-Discriminative model and to focus on hard samples lying near the decision boundary. Experimental results on nine video datasets show that our approach achieves performance comparable to that of robust supervised methods and outperforms state-of-the-art scene-specific object detection methods under varying imaging conditions.




Updated: 2022-05-10