Towards Improving Classification Power for One-Shot Object Detection
Neurocomputing (IF 6) Pub Date: 2021-05-04, DOI: 10.1016/j.neucom.2021.04.116
Hanqing Yang, Yongliang Lin, Hong Zhang, Yu Zhang, Bin Xu

Object detection based on deep learning typically relies on a large amount of training data, which can be labor-intensive to prepare. In this paper, we attempt to tackle this problem by addressing the One-Shot Object Detection (OSOD) task. Given a novel image, denoted the query image, whose category label is not included in the training data, OSOD aims to detect objects of the same class in a complex scene, denoted the target image. The performance of recent OSOD methods is much weaker than that of general object detection. We find that one reason behind this limited performance is that more false positives (i.e., false detections) are generated. Therefore, we argue that reducing the number of false positives generated in the OSOD task is important for improving performance. To this end, we present a Focus On Classification One-Shot Object Detection (FOC OSOD) network. Specifically, we design the network from two perspectives: (1) how to obtain an effective similarity feature between the query image and the target image; (2) how to classify the similarity feature effectively. To address these two challenges, we first propose a Classification Feature Deformation-and-Attention (CFDA) module to obtain high-quality query and target features, from which an effective similarity feature can be generated. Second, we present a Split Iterative Head (SIH) to improve the ability to classify the similarity feature. Extensive experiments on two public datasets (i.e., PASCAL VOC and COCO) demonstrate that the proposed framework achieves superior performance, outperforming other state-of-the-art methods by a considerable margin.
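The abstract does not specify the internals of CFDA or SIH, but the generic query-target matching step that OSOD pipelines build on (producing a similarity feature and then classifying it per location) can be illustrated. Below is a minimal PyTorch sketch, assuming depth-wise cross-correlation between the query feature and the target feature followed by a small classification head; the module name, shapes, and operations are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch of the generic matching step in one-shot object detection:
# a query (exemplar) feature is correlated with the target (scene) feature to
# produce a similarity feature, which a classification head then scores.
# NOTE: this is NOT the paper's CFDA/SIH; names and shapes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NaiveSimilarityHead(nn.Module):
    def __init__(self, channels: int = 256):
        super().__init__()
        # Small head that classifies the similarity feature (object vs. background).
        self.cls = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
        )

    def forward(self, query_feat: torch.Tensor, target_feat: torch.Tensor) -> torch.Tensor:
        # query_feat:  (B, C, h, w)  feature of the exemplar/query crop
        # target_feat: (B, C, H, W)  feature of the target (scene) image
        b, c, h, w = query_feat.shape
        # Depth-wise cross-correlation: each query channel acts as a filter
        # sliding over the corresponding target channel.
        kernel = query_feat.reshape(b * c, 1, h, w)
        scene = target_feat.reshape(1, b * c, *target_feat.shape[-2:])
        sim = F.conv2d(scene, kernel, groups=b * c, padding=(h // 2, w // 2))
        sim = sim.reshape(b, c, *sim.shape[-2:])
        # Per-location classification logits over the similarity feature.
        return self.cls(sim)

if __name__ == "__main__":
    head = NaiveSimilarityHead(channels=256)
    q = torch.randn(2, 256, 7, 7)    # query feature
    t = torch.randn(2, 256, 32, 32)  # target feature
    print(head(q, t).shape)          # -> torch.Size([2, 1, 32, 32])
```

In this sketch the quality of the similarity map depends entirely on how well the query and target features are aligned before correlation, which is the aspect the paper's CFDA module targets; the subsequent classification of that map is what SIH is reported to strengthen.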



